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For  many  years,  land  development  in  the  coastal 
regions  of  the  Gulf  of  Mexico  and  the  eastern  seaboard  has 
continued  unabated.  As  coastal  populations  increase,  it 
is  becoming  more  and  more  difficult  to  evacuate  people 
from  hurricane-threatened  areas  and  to  secure  industrial 
plants.  Greater  accuracy  is  required  in  predicting 
hurricane  landfall  in  order  to  insure  timely  evacuation. 

A  significant  result  of  this  research  is  the 
classification  of  past  storms  by  time  series  stationarity 
category  which  relates  to  direction  of  movement.  Also,  a 
psi-weight  representation  of  the  forecast  is  used  to 
develop  a  bivariate  Normal  confidence  ellipse  for  the 
threshold  autoregressive  model. 


IV 


It  is  shown  that  the  landfall  of  North  Atlantic 
hurricanes  and  tropical  storms  can  be  accurately  predicted 
by  modeling  the  storm  track  as  a  bivariate  (latitude  and 
longitude)  fifth-order  autoregressive  process.  A  thres¬ 
hold  approach  is  used  to  allow  model  parameters  to  change 
as  the  storm  moves  to  a  new  region  of  the  ocean.  For  test 
cases,  operational  average  72  hour  prediction  error  is  at 
least  three  standard  deviations  below  the  average  error  of 
the  official  forecasts  issued  by  the  National  Hurricane 
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CHAPTER  I 


INTRODUCTION 

"On  Saturday,  September  8,  1900  a  major  hurricane 
struck  Galveston,  Texas,  resulting  in  the  death  of 
approximately  6000  persons"  (Carter,  1983).  Despite  the 
fact  that  mariner  warnings  had  been  posted  the  day  before, 
the  residents  of  Galveston  were  not  prepared  for  the 
hurricane . 

Since  that  time,  land  development  in  the  coastal 
regions  of  the  Gulf  of  Mexico  has  continued  unabated. 

Today  the  population  of  the  Houston-Gal veston  metropolitan 
area  is  over  two  million  people  and  property  is  worth 
billions  of  dollars.  As  coastal  populations  increase,  it 
is  becoming  more  and  more  difficult  to  evacuate  people 
from  hurricane-threatened  areas.  A  recent  study  indicates 
that  it  would  take  26  hours  to  evacuate  Galveston  Island 
and  that  the  evacuation  order  must  come  36  to  38  hours 
before  anticipated  hurricane  landfall  (Carter,  1983). 

When  hurricane  Alicia  came  ashore  at  Galveston  on 
August  17-18,  1983,  the  city  was  better  prepared.  Many 
people  had  evacuated,  yet  Alicia  still  killed  17  people 


and  injured  3000  (Barron  and  Allred,  1983).  The 
evacuation  order  for  hurricane  Alicia  came  at  10:00  AM  on 
August  17,  approximately  14  hours  before  landfall.  As  the 
storm  passed,  inland  freeways  were  still  choked  with  cars. 
It  was  amazing  that  the  loss  of  life  was  not  greater. 

Hurricane  scientists  are  well  aware  of  the 
evacuation  problem  and  of  the  need  for  more  accurate 
landfall  prediction.  A  recent  newspaper  article  stated, 
"With  such  enormous  growth  in  the  coastal  lands,  and  no 
real  improvement  in  hurricane  forecasting,  scientists  have 
concluded  they  can  no  longer  predict  the  landfall  of  the 
big  storms  in  time  to  guarantee  that  everyone  can  get  out 
alive"  (Calonius,  1983). 

Another  problem  that  should  be  addressed  here  is 
the  cost  of  securing  a  threatened  area.  Neumann  (1975) 
estimates  1975  protection  cost  at  25  million  dollars  for  a 
typical  300  mile  stretch  of  Gulf  of  Mexico  coastline.  He 
states  that  every  10  nautical  mile  increase  in  forecast 
error  increases  the  cost  by  5  million  dollars,  and  every 
10  nautical  mile  decrease  reduces  cost  by  2.75  million 
dollars  per  storm.  Costs  have  i  creased  since  then.  At  an 
annual  inflation  rate  of  8  percent,  the  corresponding 
total  cost  would  now  be  54  million  dollars. 

The  crux  of  the  problem  rests  with  the  inadequacy 
of  present  forecasting  procedures.  Currently  the  National 


Hurricane  Center  (NHC)  needs  approximately  2  hours  and  45 
minutes  to  develop  a  72  hour  forecast.  The  major  reason 
for  this  seemingly  lengthy  lead  time  is  that  the  NHC 
computer  is  not  colocated  with  the  forecasters.  The 
average  NHC  72  hour  forecast  error  is  435  nautical  miles 
(Carter,  1983).  This  large  error  sometimes  results  in  an 
inability  of  the  population  at  large  to  "suspend 
disbelief"  with  regard  to  that  forecast.  Thus,  city 
managers  are  left  with  difficult  assessments.  Should  they 
order  a  costly  evacuation  based  on  the  forecast,  or  should 
they  wait  and  hope  the  storm  misses  their  city? 

Through  the  use  of  past  hurricane  tracks,  Dr. 
William  G.  Lesso  has  developed  a  Markov  model  that  runs  in 
five  seconds  on  a  microcomputer  but  is  usually  less 
accurate  then  the  NHC  official  forecasts  (Freeze,  1983). 
Further  refinement  of  his  initial  research  has  lead  to  a 
time  series  model  that  runs  in  less  than  ten  seconds  on  a 
microcomputer  and  has  an  average  72  hour  forecast  error  of 
312  nautical  miles. 

The  time  series  model  is  developed  in  this  study. 
The  computer  models  currently  used  to  forecast  hurricanes 
are  discussed  in  Chapter  2.  Recent  developments  in 
nonlinear  time  series  modeling  are  outlined  in  Chapter  3. 
Chapter  4  includes  a  discussion  of,  stationarity  as  it 
relates  to  hurricane  tracking,  formulation  of  the  time 


series  model,  and  estimation  of  the  parameters.  The 
chapter  also  contains  several  sections  on  bivariate 
estimation  and  the  development  of  a  forecast  confidence 
interval  that  can  be  applied  to  the  threshold  model. 


CHAPTER  2 


PREDICTION  OF  TROPICAL  CYCLONE  MOTION 


The  major  United  States  agency  involved  in 
hurricane  movement  forecasting  today  is  the  National 
Hurricane  Center  (NHC)  in  Coral  Gables,  Florida.  The  NHC 
is  definitely  the  world  leader  in  hurricane  forecasting, 
and  the  state-of-the-art  is  represented  by  the  seven 
models  that  the  NHC  uses  to  analyze  storm  movement.  In 
this  chapter  the  forecast  methodologies  used  by  the  NHC, 
adaptations  of  NHC  models  used  by  other  agencies  world¬ 
wide,  and  one  of  the  models  developed  at  The  University  of 
Texas  at  Austin  by  Lesso  and  Freeze  are  discussed. 

When  an  active  hurricane  is  being  tracked,  the  NHC 
issues  forecasts  at  least  every  6  hours  and  predicts  storm 
movement  for  lead  times  up  to  72  hours.  These  forecasts 
are  based  partially  on  seven  computer  models:  NHC-67, 
SANBAR,  HURRAN,  CLIPER,  NHC-72,  NHC-73,  and  MFM.  The 
official  forecast  is  made  by  a  highly  skilled  and 
experienced  hurricane  forecaster.  The  forecaster  combines 
the  output  of  the  seven  models  with  data  from  prognostic 
charts  of  hemispheric  circulation,  examines  the  influence 
of  these  large  scale  features  on  the  motion  of  the  storm, 
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and  using  his  best  judgement  issues  a  forecast  (Carter, 
1983;  Freeze,  1983).  The  seven  models  are  discussed  in 
the  following  paragraphs. 

NHC-67  (Miller  et  al.,  1968)  is  classified  as  a 
statistical-synoptic  model.  The  term  synoptic  refers  to 
the  use  of  weather  data  covering  a  large  area  at  a 
particular  time.  NHC-67  is  based  on  the  steering 
principle  that  a  tropical  cyclone  moves  in  proportion  to 
the  vertically  integrated  flow  around  the  vortex. 
Specifically,  the  smoothed  500,  700,  and  1000  millibar 
(mb)  height  fields  are  used  in  conjunction  with  the 
thickness  of  the  500-700  millibar  (mb)  and  700-1000  mb 
pressure  bands.  The  measurements  are  used  to  grid  the 
pressure  differences  across  the  cyclone.  The  readings  are 
then  combined  with  previous  1000  mb,  700  mb,  and  500  mb 
readings  and  used  to  modify  an  initial  forecast  based  on 
climatology  and  persistence  (the  degree  of  steadiness  of 
movement).  A  stepwise  regression  is  performed  to  select 
pressure  levels  that  are  significantly  correlated  with 
future  zonal  (east/west)  and  meridonal  (north/south) 
components  of  motion  (Neumann  and  Pelissier,  1981).  The 
result  is  a  steering  vector  that  predicts  storm  movement 
and  velocity.  The  aforementioned  pressure  levels  are 
chosen  because  the  air  flows  into  the  storm  at  the  1000  mb 
(surface)  level,  the  pressure  gradient  is  approximately 


balanced  at  the  500  mb  level,  and  the  air  flows  out  of  the 


top  of  the  system  at  about  the  200  mb  level.  (This 
inverted  "bathtub  vortex"  then  spins  counterclockwise  due 
to  the  Coriolis  effect.)  Thus,  NHC-67  uses  time  dependent 
pressure  gradients  that  reflect  the  speed  of  development 
and  strength  of  the  storm.  This  model  is  comparatively 
accurate  for  forecast  times  of  24  hours  or  less. 

The  SANBAR  model  (Sanders  and  Burpee,  1968),  has 
been  in  use  at  NHC  since  1970.  It  is  a  barotropic  model 
that  predicts  storm  tracks  by  following  minimum  stream 
function  and  maximum  vorticity  centers  in  the  belief  that 
conservation  of  momentum  is  the  primary  physical  mechanism 
that  determines  the  motion  of  the  storm  (Neumann  and 
Pelissier,  1981).  The  general  assertion  is  that  the  storm 
is  steered  by  the  large  scale  current  in  which  it  is 
embedded.  The  method  uses  winds  averaged  with  respect  to 
mass  from  the  surface  to  the  100  mb  level  (approximately 
55,000  feet),  analyzes  the  wind  circulation  in  terms  of 
stream  functions,  and  predicts  displacement  of  the 
vorticity  maximum  value  (the  eye).  It  provides  good 
results  for  prediction  periods  longer  than  36  hours  and 
works  best  in  the  tropics  (Barney,  1983).  SANBAR  requires 
computer  facilities  which  can  rapidly  process  large 
amounts  of  data. 


HURRAN  (HURRicane  ANalog)  is  an  analog  model  based 
on  the  historical  observation  that  hurricane  tracks  tend 
to  be  repetitive  (Hope  and  Neumann,  1970).  Families  of 
storms  are  identified  by  location,  motion,  and  time  of 
year.  Selected  tracks  are  moved  to  a  common  origin  (the 
current  position  of  the  existing  storm)  and  combined  with 
persistence  to  produce  a  forecast  with  assumed  bivariate 
normal  error.  The  centers  of  the  confidence  ellipses  are 
used  to  identify  the  most  likely  track.  HURRAN  is  a  good 
predictor  of  westward  movement,  but  fails  accurately  to 
model  recurvature  (the  usual  northward  turn  a  hurricane 
makes  when  exiting  the  middle  latitudes)  (Freeze,  1983). 
The  Navy  has  adapted  the  analog  techniques  of  HURRAN  to 
other  tropical  cyclone  basins  by  using  a  weighting  scheme 
that  gives  the  most  weight  to  the  most  similiar  storms. 
Adaptations  of  HURRAN  are  also  used  by  India,  Australia, 
Nationalist  China,  and  the  Peoples'  Republic  of  China 
(Hope  and  Neumann,  1977). 

CLIPER  (CLImatology  and  PERsistence)  is  a  regres¬ 
sion  model  that  was  developed  to  overcome  the  problems 
encountered  by  HURRAN  when  no  similar  storm  tracks  exist 
(Neumann,  1968).  Based  on  location,  motion,  and  forecast 
period,  CLIPER  mathematically  recreates  past  storm  tracks, 
and  applies  the  same  predictor  equations  to  the  current 
storm.  Thus  it  has  the  advantage  of  always  being  able  to 
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provide  a  forecast,  even  under  unusual  weather  conditions, 
but  it  is  not  reliable  in  predicting  northward  movement. 
Interestingly,  it  consistently  outperforms  HURRAN. 
Consequently,  CLIPER  is  frequently  used  as  a  benchmark  for 
comparing  the  accuracy  of  more  sophisticated  models 
(Neumann  and  Pelissier,  1981).  The  term  "CLIPER-class 
model"  is  now  used  to  refer  to  a  wide  range  of  models  that 
employ  climatology  and  persistence,  such  as  the  model 
developed  in  this  study. 

NHC-72  (Neumann  et  al.,  1972)  generates  two 
independent  sets  of  forecasts.  One  forecast  uses  the 
1000,  700,  and  500  mb  pressure  data,  without  persistence, 
and  the  other  uses  the  CLIPER  equations.  The  final 
equations  are  developed  through  regression  techniques  used 
to  combine  the  two  forecasts.  In  general  the  model  is  a 
combination  of  the  NHC-67  and  CLIPER  methodologies.  NHC- 
72  provides  a  better  model  of  northward  movement,  but  is 
weak  at  low  latitudes  and  in  the  westernmost  parts  of  the 
Caribbean  and  is  not  reliable  after  recurvature  (Neumann 
and  Pelissier,  1981). 

NHC-73  (Neumann  and  Lawrence,  1975)  is  a 
combination  of  the  NHC-72  and  CLIPER  systems.  It  is 
similar  to  NHC-72  in  that  it  combines  regression  equations 
based  on  synoptic  data  with  those  of  the  CLIPER  model. 
However,  it  also  selects  predictors  from  the  U.  S. 
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National  Meteorological  Center's  500  mb  Primitive  Equation 
Model  geopotential  height  prognosis.  NHC-73  is  one  of  the 
most  accurate  models  at  the  NHC,  especially  for  long 
period  forecasts. 

The  Moveable  Fine  Mesh  (MFM)  grid  model  is 
considered  to  be  one  of  the  most  sophisticated  and  complex 
dynamic  models  in  use  at  NHC  (Hope  and  Neumann,  1977). 

The  grid  follows  the  hurricanes  as  they  move.  It  is 
generally  the  same  as  other  primitive  equation  models,  but 
has  finer  resolution  and  covers  less  area.  It  is  a  ten 
layer  model  with  a  horizontal  grid  mesh  length  of  60 
kilometers  and  covers  an  area  of  9  million  square 
kilometers  (Neumann  and  Pelissier,  1981).  This  model  is 
also  used  for  precipitation  prediction.  Its  results 
compare  favorably  with  the  other  models. 

The  National  Hurricane  Center  is  definitely  the 
world  leader  in  prediction  models.  Hope  states  that  the 
analog,  regression,  and  synoptic  models  developed  by  other 
meteorological  centers  are  generally  adaptations  of  the 
NHC  models  (Hope  and  Neumann,  1977).  There  are  two 
exceptions:  (1)  the  Indian  Meteorological  Service  has 

developed  its  own  models  using  analog  techniques;  and  (2) 
the  Royal  Observatory  in  Hong  Kong  has  developed  an 
empirical  model  that  combines  climatology  and  persistence 
with  equal  weighting.  The  latter  model  combines  the  last 


twelve  hours  of  movement  with  historical  directions  and 
speeds  computed  for  each  2.5  degree  latitude- longitude 
square.  It  is  quite  accurate  where  there  is  a  high 
frequency  of  occurrence,  but  less  reliable  above  25 
degrees  north  latitude  due  to  recurvature.  The  techniques 
used  to  develop  this  model  are  similar  those  which  were 
employed  in  this  study. 

Another  approach  has  been  pursued  in  one  of 
several  hurricane  movement  models  developed  by  Lesso  and 
Freeze.  The  model  (Model  A)  is  based  on  a  Markov  process 
that  uses  historical  hurricane  tracks  to  predict  future 
tracks  (Freeze,  1983).  In  its  simplest  form  a  Markov 
process  represents  the  probabilities  of  movement  along  a 
line.  At  each  discrete  time  increment  there  is  a 
probability  P  of  an  object  moving  in  one  direction  and 
probability  1-P  of  moving  in  the  opposite  direction.  At 
each  future  time  increment  the  expected  position  of  the 
object  can  be  calculated.  The  analogy  extends  to  two, 
three,  or  more  dimensions.  The  distinguishing  feature  of 
the  process  is  one  of  being  in  a  discernable  state  which 
can  easily  be  represented  by  the  state  variables. 

Movement  to  the  future  state  depends  only  on  the  current 
state.  With  respect  to  hurricanes,  the  state  is  the 
position  of  the  storm  at  a  particular  six  hour  position 
report.  Future  movement  is  hypothesized  to  depend  only  on 


current  position  and  not  on  the  path  the  hurricane  took  to 
get  to  that  position. 

In  developing  their  model,  Lesso  and  Freeze 
analyzed  hurricanes  that  occurred  in  the  Gulf  of  Mexico 
and  the  north  Atlantic  Ocean.  In  Model  A  they  divided  the 
region  into  latitude  bands  five  degrees  in  width.  Based 
on  the  most  frequent  past  movements  in  each  band  they 
discovered,  via  least  squares  regression,  that  the  next 
change  in  longitude  and  latitude  was  described  by  the 
following  equations: 

DX  =  -1.24969  +  .001499  (P0(IP,1))2 
DY  =  -.045835  +  .022926  ( PO ( IP, 1) )  . 

PO ( I P , 1 )  was  the  current  latitude  of  the  storm.  DX  was 
the  forecast  change  in  longitude,  and  DY  was  the  forecast 
change  in  latitude.  These  quantities  and  the  cumulative 
forecast  error  were  added  to  the  last  position  to  obtain 
the  forecast  position.  Standard  errors  of  the  coeffi¬ 
cients  and  the  forecasts  were  not  discussed.  Freeze 
(1983)  considered  only  the  average  forecast  error. 

In  order  to  improve  the  forecasts,  Lesso  and  Freeze 
regressed  the  mean  error  for  each  forecast  (6  hr.,  12  hr., 
...  ,  72  hr.)  against  the  forecast  period  and  discovered 
they  were  linearly  related.  Standard  deviations  of  the 
estimates  were  not  discussed.  Specifically  the  correction 


for  latitude  is  given  by 


Y  =  -.35  +  . 2  9  X 

and  for  longitude 

Y  =  -.  32  +  .  3  6X 

where  X  represents  the  forecast  period  (X=l  is  the  6  hr. 
forecast,  X  =  2  is  the  12  hr.  forecast,  etc.)  and  Y  is  the 
correction.  Seeking  an  even  better  correction  scheme  they 
found,  by  trial  and  error,  that  the  latitude  correction 
should  be  multiplied  by  a  factor  of  three. 

Error  analysis  of  five  historical  hurricane  tracks 
shows  that  Model  A  is  slightly  less  accurate  than  the 
offical  forecasts  issued  by  the  National  Hurricane  Center. 
The  comparative  data  for  hurricanes  Frederic  (August, 
1979),  Dennis  (August,  1981),  Allen  (July,  1980),  Floyd 
(September,  1981)  and  Gert  (September,  1981)  are  presented 
in  Table  2.1  on  the  following  page. 


TABLE  2.1 


FORECAST  MEAN  ERROR  DISTANCE 


Hurricane  Forecast 

Error  (Nautical  Miles) 

Forecast 

Name 

24  1 

8R  48  HR 

72  HR 

Source 

Frederic 

91 

188 

286 

Model  A 

69 

143 

218 

NHC 

Dennis 

99 

261 

406 

Model  A 

92 

207 

364 

NHC 

Allen 

100 

213 

325 

Model  A 

173 

353 

589 

NHC 

Floyd 

115 

272 

558 

Model  A 

93 

234 

408 

NHC 

Gert 

133 

372 

859 

Model  A 

136 

236 

455 

NHC 

Average 

108 

261 

487 

Model  A 

113 

235 

407 

NHC 

Sources:  Freeze 

(1983)  , 

Hebert  et 

al.,  (1980)  , 

Staff, 

NHC  (1982),  Taylor 

et  al.,  (1981) 

CHAPTER  3 


NONLINEAR  TIME  SERIES 

Many  processes  occurring  in  nature,  and  in  a  vari¬ 
ety  of  engineering  fields,  exhibit  behavior  that  can  not 
be  adequately  represented  by  a  linear  time  series.  This 
has  resulted  in  significant  interest  in  developing  non¬ 
linear  time  series  models  {Haggan  et  al.,  1984).  Priestly 
(1980)  discusses  a  general  class  of  nonlinear  time  series 
models  called  "state  dependent  models"  (SDM ) .  In  the  SDM 
approach,  current  values  of  the  coefficients  depend  on 
previous  values  of  the  time  series.  When  considering  the 
location  prediction  of  a  moving  target  one  would  like  the 
flexibility  of  allowing  the  model  parameters  to  vary  over 
location.  Thus,  it  would  be  desirable  to  utilize  a  model 
which  exhibits  properties  of  the  SDM  models. 

In  this  chapter  univariate  and  bivariate  cases  of 
the  state  dependent  model  are  considered.  The  univariate 
Box-Jenkins  Autoregressive  Moving  Average  (ARMA)  model 
is  given  as  a  basis  for  comparison.  Then  the  general  SDM 
approach,  the  bilinear  model,  the  exponential  autoregres¬ 
sive  model,  and  the  threshold  autoregressive  (AR)  model 
for  both  the  linear  and  nonlinear  cases  are  presented. 
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Estimation  of  SDM  parameters  is  also  discussed.  In  the 
bivariate  section  the  "open-loop  threshold  autoregressive 
system,"  which  is  very  similar  to  the  approach  used  in  this 
research,  is  discussed  (Tong  and  Lim,  1980). 

The  Linear  ARMA  Model 

The  linear  ARMA  (k,£)  model  is  given  by 

Xt  +  ^lxt-l  +  **’  +  ^xt-k  =  u  +  et  +  6iet-l  +’**  + 

where  et  is  a  sequence  of  zero-mean  random  error  terms  and 

01 ,  ...  •••  » 4>  are  constants.  The  objective  is 

to  predict  Xt  based  on  previous  observations  and  random 
inputs . 

The  General  SDM 

The  SDM  is  an  extension  of  a  linear  ARMA  time 
series  model  to  the  case  where  a  process  C X 1 3  can  be 
represented  by  a  nonlinear  model  whose  behavior  may  be 
approximated  locally  by  a  linear  ARMA  time  series  model 
(Haggan  et  al. ,  1984).  (The  term  "locally"  implies  small 
departures  of  the  model  from  its  current  state.)  This 
leads  to  the  general  model 

Xt  +  $ i  ( k t_ i^xt  —  1  +  ^ 2  ^ xt- 1  ^  X t-2  +  *  ^k  (xt-l  ^  Xt-k 

=  U  ( x  f  i )  +  et  +  ®  ]_  ( xt- 1 )  e  t- 1  +  - +  0£(xt-l)et-£  •  (3-D 

That  is,  the  coefficients  depend  on  the  state  vector  xfc  of 

the  process  in  the  previous  time  period. 


There  are  benefits  and  disadvantages  to  using 
nonlinear  SDM  models  to  describe  hurricane  movement.  The 
advantage  is  that  parameter  values  can  change  as  the  storm 
moves.  This  should  result  in  forecasts  with  less  error 
than  forecasts  produced  by  linear  models.  Unfortunately, 
the  increased  accuracy  comes  at  a  price.  Identification 
procedures  for  SDM  models  are  not  well  established.  To 
circumvent  this  problem,  researchers  typically  fit  "all 
orders"  of  a  particular  model  and  choose  the  model  that 
fits  the  "best"  according  to  some  predetermined  measure. 
Tong  and  Lim  (1980)  propose  a  procedure  that  is  outlined 
later  in  the  chapter.  In  addition,  computation  times  are 
sometimes  large,  and  there  can  be  convergence  problems 
when  estimating  the  parameter  values. 

The  hurricane  model  used  in  this  research  is  a 
piecewise  linearization  of  the  hurricane  movement  process. 
Parameter  values  are  allowed  to  change  only  when  the  storm 
crosses  a  "threshold"  and  enters  a  new  region  of  the  North 
Atlantic.  Thus,  most  of  the  usual  time  series  identifica¬ 
tion,  estimation,  and  forecasting  procedures  still  apply 
within  each  region.  The  segmenting  of  the  North  Atlantic 
ocean  into  several  regions  increases  the  computation  time 
and  requires  the  availability  of  large  numbers  of 
hurricane  position  reports. 
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Criteria  (AIC)  (Akaike,  1973).  AIC  is  defined  as 

AIC  =  N  In (RSS/N)  +  2k 

where  RSS  is  the  residual  sum  of  squares,  N  is  the 
number  of  observations  and  k  is  the  number  of  independent 
parameters.  Parameter  values  are  then  computed  using  a 
nonlinear  optimization  procedure,  such  as  Newton-Raphson, 
to  minimize  the  RSS.  Starting  values  for  the  AR  portion 
of  the  model  are  obtained  by  fitting  an  AR(p)  model.  The 
remaining  coefficients  are  initiated  at  zero.  An  alterna 
tive  starting  procedure  is  to  take  initial  estimates  from 
the  model  of  order  BL(p,0,p,q-l)  or  BL (p-1 ,0  ,p-l  ,q)  which 
ever  has  the  smallest  RSS. 

Lee  (1985)  used  a  biliner  model  to  predict  sea 
state  processes.  He  tentatively  identified  the  order  of 
the  autoregressive  and  moving  average  components  by  using 
the  usual  identification  procedures  for  linear  models. 

The  order  of  the  cross  product  term  was  identified  by 
selecting  the  model  that  minimized  the  AIC  statistic. 

The  bilinear  model  was  considered  for  use  in 
modeling  hurricane  tracks.  However,  it  was  discarded 
when  a  comparatively  simple  additive  autoregressive 
threshold  model  was  found  to  accurately  describe  storm 


movement . 


2 


Exponential  Autoregressive  Model 

This  class  of  models  is  formed  by  assuming  u=0, 
eu  =  0  ,  for  a  1 1  u  and 

<Mxt-l)  =  +  *u  exp[-YX2t_1] 

(Haggan  et  al.,  1984).  Then  (3.1)  yields 

Xt+(i))1+ir1  exp(-X2t_1 3 )  Xt_1  +  ...+ (  1^+ ir^  expC-X2t_1])Xt_k  =  e  t. 
Ozaki  (1980)  has  used  this  model  to  analyze  self-sustained 
oscillations  of  springs  and  nonlinear  oscillations  in 
electric  circuits.  Model  identification  was  not  discussed. 

Haggan  and  Ozaki  (1981)  describe  a  procedure  for 
estimating  the  order  k,  and  the  coefficients 
(y ir^)?  i  =  l,...,k).  First  y  is  fixed  at  particular 
grid  values  and  Xt  is  regressed  against  exp(-yX2t_1 3XS 
(s  <  t)  and  against  previous  Xt  values,  from  order  k=l  up 
through  arbitrary  order  m.  Then  the  model  that  minimizes 
the  AIC  for  that  y  is  selected.  The  AIC  statistic  can 
also  be  compared  across  y  values  to  select  the  "best" 
model  overall. 

This  model  was  not  used  in  this  research  because 
the  small  size  of  the  segmented  hurricane  tracks  made  it 
desirable  to  limit  the  number  of  model  parameters.  The 
bivariate  threshold  autoregressive  model  that  was  finally 
developed  had  one-half  the  number  of  parameters  as 
compared  to  the  exponential  autoregressive  model. 
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Threshold  AR  Model 

The  idea  underlying  the  threshold  AR  model  is  the 
piecewise  linearization  of  a  nonlinear  model  of  the  state 
space  by  introduction  of  "thresholds."  The  models  are 
then  locally  linear  (Tong  and  Lim,  1980).  For  this  model 
it  is  assumed  that  e„  =  0  for  all  u  and 


y(xt_i)  = 


>u(xt-l>  = 


The  resulting  model  is 


U!  if  Xt_d  £  c 


^ 2  if  xt-d  >  c 
xt.d  s  c 


iP’if  Xt.d  >  c 


Xt  +  <f>  i  Xt-i  +  ’*’+<t,k^xt-k  =  y  1  +  et  if  xt-d  “  c 
Xt  +  <^i  (2)  X^__jl  +**•+  <(>k  ^  ^  X  t-k  =  V  2  +et  if  X  >  c 

(Haggan  et  al.,  1984). 

Tong  and  Lim  (1980)  used  this  model  to  study  the 
nonlinear  aspects  of  "jump  resonance"  and  "amplitude- 
frequency  dependency."  Jump  resonance  is  the  sudden 
change  in  output  amplitude  that  occurs  at  different  input 
frequencies  depending  on  whether  the  input  frequency  is 
increasing  or  decreasing.  Amplitude-frequency  dependency 


is  present  when  an  output  signal  has  different  frequencies 
of  oscillation  for  different  amplitudes.  The  estimation 
procedure  used  by  Tong  and  Lim  is  described  later  in  the 
chapter . 

The  threshold  AR  model  is  nearly  the  model  used 
in  this  research.  The  differences  are  that  the  hurricane 
model  is  bivariate,  and  the  parameter  values  depend  upon 
the  region  in  which  the  storm  is  located  rather  than 
depending  on  the  value  of  the  last  observation  Xt_-.. 

Nonlinear  Threshold  AR  Model 

This  model  is  a  modified  form  of  the  threshold  AR 
model  (Haggan  et  al.,  1984).  Lety  =  0,  0  u  =  0  for  all  u 
and  define 


xt  +  (H+n  I  xt-i !  )xt-i  +  ,**  +  4k+*k  I  xt-i  I  )xt-k  =  et  if  ,Xt_ll'c 

x  t +  <4*  1  +1f  j.  c )  X  t  - 1  c)Xt_k  =  et  if  I  X  t  -  i  I  >c 

This  model  has  the  flexibility  to  change  parameters  from 


period  to  period,  based  on  a  state  of  the  previous  period. 

This  type  of  model  provides  the  framework  for  the 


model  used  m  this  research.  However,  parameter  values 
depend  on  the  previous  observation  xt_^  which  is  not 
desirable.  For  the  hurricane  process  it  is  belived  that 
the  climatology  of  the  storms  changes  slowly.  Piecewise 
linearization  enforces  this  assumption  by  allowing  the 
forecast  parameters  to  remain  constant  until  the  next 
threshold  is  crossed. 


Estimation  of  SDM  Parameters 
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for  all  u,  where  X't+1  =  Xt+1~Xt- 

The  gradients  are  unknowns  that  must  be  estimated. 
The  basic  strategy  is  to  allow  the  gradients  to  follow  a 
random  walk  which  can  be  represented  in  matrix 
form  by 


where  Bfc  =  (a^,  g  , 
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(t) 
l  ' 


and  Vt  is  a  sequence  of  independent  matrix-valued  random 
variables  distributed  as  multivariate  Normal  with  zero 
means.  For  each  t,  the  procedure  determines  tlose  values 
of  Bt  which  minimize  the  discrepancy  between  the  observed 
value  of  Xt  and  its  predicted  value  from  the  model.  This 
sequential  algorithm  resembles  the  procedure  used  in  the 
Kalman  filter  algorithm  (Haggan  et  al.,  1984). 


Open  Loop  Threshold  Autoregressive  System 

Tong  and  Lim  (1980)  developed  a  more  general 
representation  of  the  threshold  autoregressive  (TAR) 
model.  One  of  their  models,  the  open  loop  threshold 
autoregressive  system  (TARSO) ,  is  nearly  identical  to  the 
hurricane  forecasting  model.  There  are  two  major 
differences  between  the  TARSO  model  and  the  model  used  in 
this  study.  First,  Tong  and  Lim  require  pairwise  indepen¬ 
dence  between  all  white  noise  sequences.  The  white  noise 
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sequences  in  the  hurricane  model  are  allowed  to  have  a 
contemporaneous  covariance  structure.  Second,  the  TARSO 
model  requires  some  past  observation  to  be  in  a  particular 
region  before  switching  parameters.  In  the  hurricane 
model,  the  parameters  change  when  the  forecast  crosses 
into  a  new  region. 

Tong  and  Lim  (1980)  represent  the  TARSO  model  as 
m  m 1 

Xn  =  4  +  aijxn-i  +  £  bijYn-i  +  ^nj 

i=l  i=0 

where  C Xt 3  is  the  output  series,  (Yt}  is  the  input  series, 
and  the  coefficients  aQ,  a^  and  b^  are  dependent  on  the 
value  of  Yn_d  (some  previous  input).  en  is  a  white  noise 
sequence  with  zero  mean  and  finite  variance  and  is 
independent  of  y  .  The  values  of  the  nonlinear  series  are 
assigned  to  j  nonoverlapping  intervals  by  percentiles  of  Y 
(taken  on  the  observed  range).  When  Yn_d  crosses  a  per¬ 
centile  boundary,  the  coefficients  are  allowed  to  change. 

Tong  and  Lim  (1980)  use  an  estimation  procedure 
in  which  the  data  are  divided  into  two  regions,  those 
values  falling  above  a  percentile  breakpoint  t  ,  and  those 
falling  below.  In  each  region,  AR  models  are  fit  up  to  a 
maximum  (arbitrary)  order.  The  models  that  minimize  the 
A IC  in  each  region  are  selected.  Let  the  order  of  the 


models  be  k1  and  k2*  Then  one  can  write 

AIC(t  )  =  AIC  ( k-^ )  +  AIC  ( k 2 ) 

Next,  t  is  allowed  to  vary  over  a  preselected  set  of 
tq's.  The  value  of  t^  that  minimizes  the  AIC  for  the 
applicable  k^  and  k2  order  models  is  selected  as  the 
threshold  value.  Then  d  (for  Yn_^  above)  is  varied  over  a 
preselected  set  of  integers.  The  d  that  minimizes  the  AIC 
for  the  values  of  t  ,  k^,  and  k2  is  selected  as  the  appro¬ 
priate  lag. 

Tong  and  Lim  also  develop  the  eventual  forecast 
function.  Their  forecasts  take  the  form  of  an  oscillatory 
series  of  constant  period.  This  is  due  to  the  fact  that, 
unlike  a  linear  Box-Jenkins  model,  a  stable  nonlinear 
system  will  continue  to  oscillate  (converge  to  a  limit 
cycle,  which  may  degenerate  to  a  constant)  after  termina¬ 
tion  of  input  (Tong  and  Lim,  1980).  To  analyze  forecast 
error,  Tong  and  Lim  delete  the  last  10  percent  of  the 
observations,  and  then  fit  a  new  model.  If  the  fitted 
model  using  the  complete  data  set  does  not  differ 
significantly  from  the  new  model,  then  the  original  model 
is  adopted  as  the  final  model.  Confidence  intervals  of 
the  forecast  were  not  discussed. 


CHAPTER  4 


MODEL  IDENTIFICATION  ESTIMATION  AND  FORECASTING 


The  procedure  used  to  model  the  hurricane  tracks 
is  outlined  in  this  chapter.  Stationarity  and  the 
resulting  identification  of  the  appropriate  autoregressive 
model  are  discussed.  Univariate  and  bivariate  models  are 
estimated,  and  models  are  checked  for  adequacy  and  then 
used  to  forecast  the  longitude  and  latitude  series  of 
actual  hurricane  tracks. 

Stationarity 

A  stochastic  process  is  strictly  stationary  if  its 
properties  are  unaffected  by  a  change  in  origin;  that  is, 
if  the  joint  probability  distribution  of  m  observations 
Z^,  Z  2 1  •••  »  Zm,  made  at  any  set  of  times  t^,  t-2>  •••  / 
tm  is  the  same  as  that  of  a  different  set  of  observations 

Z1+k,  ^2+kf  ***  '  ^  m  +  k  *  at  times  t  ^  +  k ,  ^■2+kf  ' 

tm+k,  for  any  integer  k  (Box  and  Jenkins,  1976). 

A  convenient  example  of  a  process  that  is  mean 
nonstationary  is  the  latitude  position  series  of  a 
hurricane  moving  due  north  at  constant  velocity.  Clearly 
the  value  of  latitude  increases  over  time.  Thus,  the 
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latitude  does  not  vary  about  a  constant  mean.  It  is  non¬ 
stationary.  However,  'he  longitude  is  constant  and  seems 
to  vary  about  a  constant  mean.  It  is  apparently  station¬ 
ary.  In  an  actual  hurricane  position  series,  nature 
supplies  random  shocks  that  cause  the  longitude  and 
latitude  to  vary  about  their  respective  means.  The 
stationarity  of  the  latitude  and  longitude  series  is 
important  and  is  discussed  in  detail  in  the  next  section. 

In  practical  applications  it  is  virtually 
impossible  to  test  for  strict  stationarity  (Granger  and 
Newbold,  1977).  Fortunately,  under  the  assumption  of  a 
Gaussian  process,  the  conditions  known  as  weak  stationar¬ 
ity  are  equivalent  to  strict  stationarity.  A  process  Cw^ 
is  weakly  stationary  (covariance  stationary)  if  over  time 
it  varies  about  a  constant  mean,  the  process  variance  re¬ 
mains  constant,  and  the  cov(wt,wt+k)  is  constant  for  all  t. 

In  order  to  obtain  a  consistent  and  unbiased 
estimate  of  the  population  mean  it  is  necessary  that  the 
process  be  ergodic.  Ergodicity  is  explicitly  defined  by 
Hannan  (1970).  Ergodicity  implies  that  observations  of 
the  process  sufficiently  far  apart  in  time  are 
uncorrelated,  so  that  when  averaging  a  growing  series 
through  time,  new  information  is  continually  added.  Then 


'm  =  (1/m)  Z  wt 

t  =  l 


(4.1) 
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can  be  used  to  estimate  the  mean  of  the  process. 

Similarly,  the  estimators  of  cov (wt , wt  +  k)  will  also 

be  consistent.  An  estimator  6  is  a  consistent  estimator 

of  0  if,  as  the  sample  size  increases,  it  approaches  0  in 

probability.  That  is  if 

lim  P[  | 0  —  0 [  £  e]  =  1  ,  for  any  e  >  0 

n-*-® 

The  cov (wt ,wt  +  k)  ,  the  autocovariance  at  lag  k,  is 
defined  by 

Yk  =  cov  [wt,wt  +  k]  =  E  [  (wt-p)  (wt  +  k-  p)  ]  (4.2) 

where  p  =  E  [wtJ.  Given  N  observations,  p  is  estimated  by 

N 

w  =  (1/N)  £  w.  .  (4.3) 

t  =  l 

The  autocorrelation  at  lag  k  is 

pk  =  y_k  =  E  [  (wt-p)  (wt4k-p)  ] 
y 0  E  [ (wt-y ) (wt  -  p  )  ] 

=  E  Hwt-P)(wt_k  -  v)]/°w2  *  ,4) 

Pk  is  estimated  by  r^  =  C]^/c0  where 
N-k 

ck  =  4  1  (wt~w) (wt+k-w)  •  (4.5) 

N  t  =  l 

The  ck  are  the  estimates  of  Yk*  In  order  to  compare  the 
ck  values,  °w  must  remain  constant  over  time.  Also  cQl 
the  estimator  of  the  process  variance,  ow^'  must  be 


constant  so  that  the  values  of  may  be  compared  at 
various  lags. 

In  modeling  hurricanes,  the  approach  used  in  this 
study  involved  dividing  the  North  Atlantic  region  into 
rectangular  grids  within  which  the  individual  hurricanes 
moved  in  a  similar  "pattern"  (i.e.  with  constant  velocity 
or  acceleration  during  a  period  of  p  observations  for  the 
AR(p)  model).  That  is,  the  tracks  were  divided  into 
approximately  homogeneous  segments.  In  order  to  analyze 
various  orders  of  AR  models,  it  was  necessary  to  develop  a 
procedure  to  determine  if  a  given  hurricane  track  was 
covariance  stationary.  Within  each  grid,  if  the  latitude 
and  longitude  series  (possibly  differenced)  are  individ¬ 
ually  covariance  stationary  then,  under  the  assumption 
that  the  bivariate  process  is  Normal,  the  bivariate 
process  is  stationary. 

Theorem  4.1: 

Definitions 

Let  Yt  and  Xt  be  weakly  stationary  univariate  series 
with  zero  mean. 

N  is  the  number  of  observations. 

p  is  the  order  of  the  AR  model. 

is  the  cross  correlation  E [ ] / ( a ^02)  . 

$11  i  is  the  autoregressive  coefficient  of  Xt-i  used 
to  predict  Xp 

$12  i  is  the  autoregressive  coefficient  of  Yp_i  used 
to  predict  Xp 


<fl2i, i  is  the  autoregressive  coefficient  of  Xt_j_  used 
to  predict  Yf 

^22  i  is  t^ie  autoregressive  coefficient  of  Yt_ used 
to  predict  Yf 

£1  and  £2  are  the  white  noise  series  for  X  and  Y, 

^  N(0,o]_2) 

£2  ^  N  (  0  ,  cj  2  2 ) 

and , 


cov<elt  e2s>  = 
Then  the  general 


bivariate  AR(p) 


t*s 


t  =  s 


model  is 


X  t  =  ,lJ  4*11 ,  ixt-i  +  ♦l2,iYt-i1  +  elt 
1  =  1 


Yt  =  l  [  <(,01  ;X*-_  :  + 

i  =  l 


21 , ixt-i  +  ^22  , iYt-i ^  +  E2t 


The  covariance  matrix  of  the  bivariate  process  is 


E  [  (X , Y)  '  ( X ,  Y)  ] 


o i  o1a2P 
axa2p  o  2 


where  X  and  Y  arc  column  vectors  of  length  N. 

To  prove  that  the  bivariate  process  is  weakly 
stationary,  it  must  be  shown  that  the  means  and  the 
covariance  matrix  of  X  and  Y  are  constant  over  time. 
Proof : 


Clearly  the  means  are  constant. 


It  is  given 


that  cr^2  and  a  are  constant  over  time.  Thus,  the 
proof  reduces  to  showing  that  (the  cross 
correlation  at  lag  k)  is  constant  over  time,  or 
EtXtYt_k]  =  c,  for  all  t,  where  p  +  k  <  t  <  N.  The 
use  of  lag  k  or  lag  -k  is  arbitrary.  For  stationari- 
ty,  both  pk  and  p_^  must  be  constant  over  time  (but 
not  necessarily  egual).  By  definition  of  stationari- 
ty,  E[XtXt_)<]  =  a±f  a  constant.  Then, 

E[Xtxt-kJ  =  *11 , 1E  f  Xt-lXt-k^  +  *11 , 2E  fXt-2Xt-K^  + 

+*11 ,pE  ^xt-pxt-k^ 

*^12, lEIYt~lxt-k] +*12 ,2ElYt-2xt-d + 
t'*12,pE'yt-pxt-kJ*E[£1txt-k]  '  al  I4-6* 

where  E[eltXt_k]  =  0  .  Also, 


E  ^ YtXt-k ^  =  *21 , 1E tXt-lXt-k^ +*21 , 2E ^xt-2Xt-kJ + 
+*21 ,pE [Xt-pXt-k] 

+  *2  2 , 1E  fYt-lxt-kJ  +  *22 ,2E  [yt-2Xt-k^  + 

^22,pEfYt-pXt-kJ+Efe2tXt-k3 

where  E[e2txt-k^  =  0  • 


(4.7) 


The  individual  expected  values  in  (4.6)  and  (4.7)  are 
equal.  Only  the  coefficients,  which  are  known 
scalars,  differ  in  the  two  equations.  From  (4.6)  it 


is  clear  that 
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n2,lE[Yt-lXt-kl+^12,2E[Yt-2Xt-kl  +  *" 
+*12/PEtYt-pxt-kl 

must  be  constant  over  t.  Thus,  in  (4.7),  the  quantity 


^22 , 1E  ^t-lxt-k^  +<^22 , 2E  ^Yt-2Xt-k^  +  * 

+  <^2  2  ,pE  ^xt-pxt-k^ 

must  also  equal  some  other  constant  for  all  t. 
Therefore,  E[YtXt_k]  -*-s  constant  over  t. 

It  is  easy  to  determine  if  a  univariate  process  is 
stationary.  For  the  AR(p)  process  let 


wt  =  <!’lwt-l  +  ***  +  ^>wt-p  +  at  a  N  ( 0  ,  a  a  2 ) 

An  equivalent  representation  using  the  backshift  operator 
B  is  given  by 

(i  “  -  ^pB^)  wt  =  at 


The  effect  of  B  is  to  shift  w^_  back  one  time  interim,  i.e. 
Bwfc  =  wt_-p  Now,  consider  the  AR(1)  process  where 


then 


(1  -  (f^B)  w^.  =  a^- 


wt  =  at/  (1  -  <#> XB) 

=  a^.  ( 1  +  <P  iB+4>  i^B^+  ...  ) 

=  at  +  $iat-i  +  <P  1 2at-2  +  ••• 


and  the  variance  of  w ^  is  given  by 

Var[wt]  =  var[at]  (1  +  <^12  +  ^14  +  ...  ) 


Var[wt]  (i-4i12)  =  Var[afc] 


(4.8) 


Clearly  if  these  variances  are  to  be  finite,  the  geometric 
series  in  $  must  converge  so  <j> <  1. 

Equivalently,  for  a  stationary  process  of  any 
order,  the  roots  of  the  <j>  (B)  polynomial  lie  outside  the 
unit  circle.  For  the  AR(1)  model  this  polynomial  is  given 
by 

♦  (B)  =  (l-^B) 

Regarding  B  as  a  variable,  if  4>  (B)  is  set  equal  to  zero 
and  solved  for  B,  for  stationarity  the  roots  of  the 
equation  B=l/<))1  must  be  greater  than  one.  In  general,  for 
the  AR(p)  process,  the  roots  of  the  <p  (B)  polynomial 

♦  (B)  =  (1  -  ♦  jB  -  <P2B2  -  -  <f>pBp) 

must  lie  outside  the  unit  circle  (Box  and  Jenkins,  1976). 
Hurricane  Stationarity 

Suppose  the  last  n  position  reports  of  a  hurricane 
(or  tropical  storm)  are  defined  by  the  ordered  pairs 

<LA  t-  i ,  LOt- 1 )  (LAt_2  ,L0t_2)  ...  (LAt_n  ,L0t_n)  .  The  set 
of  position  reports  are  two  time  series,  latitude  and 
longitude.  It  is  desired  to  forecast  (LAt,LOt)  as  a 
linear  combination  of  the  n  previous  position  reports. 


.V 


.-I , 


The  general  model  is 


LAt 

=  aii/iLAt_1+a11>2LAt_2+ 

+all,nLAt-n 

+a12,iLOt-i+ai2,2LOt-2+ 

+a12 ,nL0t-n 

L0t 

=  a2l,lLAt-l+a21,2LAt-2+ 

+a21,nLAt-n 

+a22,iLOt-l+a22,2LOt-2+ 

+a22 ,nLOt-n 

In  order  to  develop  a  model  of  the  latitude  series  LAfc_2, 
LA t_ 2 /  •••  r  LAt-n  and  longitude  series  L0t_2,  LOt_2r  •  ••» 
L0t_n,  it  is  necessary  that  the  series  be  weakly  station¬ 
ary.  Hurricane  series  representing  six  hour  position 
reports  are  too  short  for  nonconstant  variance  to  be 
detected,  but  often  are  nonstationary  in  their  means. 

While  it  would  seem  that  a  hurricane  which  is 
continually  in  motion  could  never  be  considered  to  be 
stationary,  this  is  not  typically  the  case.  If  a  storm  is 
moving  due  west  (W)  or  east  (E) ,  the  latitude  series 
remains  constant.  In  this  case  the  hurricane  is  latitude- 
position  stationary,  i.e.  the  time  series  LA^-_2' 

LAt_2,  ...  ,  LAt_n  varies  about  a  constant  mean.  A  storm 
moving  due  north  (N)  or  south  (S) ,  is  longitude-position 
stationary . 

If  the  storm  is  moving  northwest  (NW) ,  northeast 
(NE) ,  southwest  (SW) ,  or  southeast  (SE) ,  it  is  neither 
latitude-position  stationary  nor  longitude-position 
stationary.  When  this  occurs,  stationarity  can  be  induced 


by  differencing  (calculating  the  change  per  unit  time 
interval)  the  latitude  and  longitude  series.  If  the  new 
series  (which  now  represent  velocities)  vary  about  a 
constant  mean,  the  hurricane  is  said  to  be  latitude- 
velocity  and/or  longitude-velocity  stationary.  It 
describes  a  storm  moving  (say  NW)  at  constant  velocity  and 
is  used  to  predict  the  next  velocity,  i.e.  the  next  change 
in  position.  This  is  the  basis  of  the  model  developed  by 
Lesso  and  Freeze,  although  their  model  uses  the  previous 
latitude  to  predict  velocity  and  is  based  on  all  past 
storm  tracks  (Freeze,  1983).  If  only  the  previous 
velocity  were  used  to  predict  the  next  change  in  position, 
the  model  would  be  first-order  autoregressive  (AR1),  or 
equivalently,  a  Markov  random  walk  in  velocity. 

If  the  hurricane  is  accelerating  (say  in 
latitude) ,  the  latitude  series  must  be  differenced  twice 
to  induce  stationar ity.  The  process  of  determining 
whether  the  latitude  and  longitude  series  are  stationary 
in  position,  velocity,  or  acceleration,  results  in  nine 
possible  classifications  (categories)  for  u  particular 
track.  These  categories  have  a  useful  physical  interpre¬ 
tation  related  to  the  direction  of  motion  (Table  4.1). 


TABLE  4.1 


HURRICANE  STATIONARITY  CATEGORIES 


LATITUDE 

POSITION 

LONGITUDE 

VELOCITY 

ACCELERATION 

POSITION 

1 

Standing 

Still 

2 

Moving 

West  (East) 

3 

Accelerating 
West  (East) 

VELOCITY 

4  1 

Moving  j 

North  (South) 1 

5 

Moving 

NE,SE,SW,NW 

6 

Recurving 

N , S  to  E  ,  W 

ACCELERATION 

7 

Accelerating 
North  (South) 

8 

Recurving 

W , E  to  N , S 

9 

Accelerating 

NE,SE,SW,NW 

Hurricane  Model  Identification 

Once  stationarity  is  confirmed,  it  is  necessary  to 
determine  the  order  of  the  autoregressive  process.  The 
general  pth  order  bivariate  (latitude,  longitude)  model  is 
given  by 

LAfc  =  l  [*11 ,iLAt-i  +  ^12 , iLOt-i]  +  alt 
i  =  l 

P 

LOt  =  ^  ^21  ,iLAt-i  +  ^22 ,  iLOt-i^  +  a2t  * 

Tiao  and  Box  (1981)  developed  a  useful  identification 
procedure  for  such  multivariate  models.  The  process 
involves  fitting  AR  models  of  successively  higher  order 
and  examining  the  cross-correlation  and  partial  auto¬ 
regression  matrices  after  each  fit.  Let  r  (£ )  be  the 
lag  £  cross-covariance  matrix.  Then  for  the  bivariate 


process 

r(£)  =  jU)},  l  =  0,  ±1,  ±2,  ... 

i,j  =  1,  2 

and  p  ( £)  =  (  p^j  (  £)3  is  the  corresponding  cross-correlation 
matrix.  For  a  stationary  multivariate  AR(p)  process,  the 
auto  and  cross-correlations  decay  slowly  to  zero  as  the 
lag  increases.  The  pfc^  partial  autoregression  matrix 
P(p)  is  the  matrix  of  autoregressive  coefficients,  i.e.  at 
lag  p 

P(P)  = 

where 

?p  =  ( <J>i j  ^ pi  i,j  =  1,2 

For  a  stationary  AR(p)  process,  the  partial  autoregression 
matrix  is  zero  beyond  lag  p,and  the  estimates  $^,...,$p 
are  asymptotically  jointly  Normally  distributed.  The 
significance  of  the  parameters  may  be  tested.  (The  test 
results  can  be  represented  by  filling  the  partial  auto¬ 
regression  matrix  with  a  '+'  sign  if  the  coefficient  is 
more  than  two  standard  deviations  above  zero,  or  a 
sign  if  it  is  more  than  two  standard  deviations  below 
zero.)  Their  procedure  is  implemented  in  the  time  series 
package  by  Statistical  Computing  Associates  (SCA;  Liu  et 
al.,  1983). 

Unfortunately,  due  to  the  segmenting  of  hurricane 
tracks  and  the  resultant  missing  values,  SCA  could  not  be 


used.  Instead  Statistical  Package  for  the  Social  Sciences 
(SPSS;  Nie,  et  al,  1975)  was  employed.  The  Box  and  Tiao 
approach  was  adapted  to  the  partial  F  statistic  computed 
by  SPSS.  The  lagged  data  were  sequentially  regressed  and 
the  plus  and  minus  signs  were  assigned  based  on  a  90% 
confidence  interval  for  the  regression  coefficients. 

In  the  first  analysis,  all  velocity  lags  of 
latitude  and  longitude  were  considered  up  to  lag  5  (30 
hours  prior  to  the  current  velocity  observation).  As 
the  lag  increased,  a  gradual  decrease  in  the  auto  and 
cross-correlations  was  noted.  Significant  coefficients 
occurred  at  lags  1,  4,  and  5.  This  implied  that  the 
process  could  be  autoregressive  with  a  "cyclic"  component 
at  lag  4,  representing  a  24  hour  lag.  This  component 
could  physically  reflect  the  diurnal  effect  of  the  sun 
(the  slowing  of  the  storm  at  night). 

In  the  second  set  of  regressions  lags  1,  4,  and  5 
were  considered.  In  general,  the  lag  5  parameter  was 
weak,  so  it  was  dropped  from  the  model.  This  led  to  the 
general  model  with  velocity  coefficients  evaluated  at  lags 
1  and  4. 

Based  on  the  analysis  by  Lesso  and  Freeze,  it  was 
believed  that  the  model  coefficients  should  be  allowed  to 
change  as  the  storm  moved.  To  capture  this  change,  more 
than  300  historical  North  Atlantic  hurricane  tracks  (from 
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the  National  Oceanic  and  Atmospheric  Administration 
magnetic  tape  described  in  Appendix  F)  were  segmented  by 
latitude  bands  eight  degrees  in  width  with  a  three  degree 
overlap.  Larger  band  widths  masked  the  diurnal  effect, 
and  smaller  band  widths  resulted  in  too  few  observations 
per  track.  Segmenting  the  longitude  into  similar  bands 
increased  forecast  error  primarily  due  to  the  reduction  of 
latitude  observations.  Models  were  then  fit  to  each 
region.  The  models  are  discussed  in  Chapter  5  and  are 
listed  in  Appendix  A. 


Data  Manipulation 


For  each  individual  grid,  the  physical  manipu¬ 
lation  of  the  data  that  produced  the  model  coefficients 
required  five  major  steps.  In  the  first  step,  the  raw 
data  file  of  hurricane  tracks  was  condensed.  Storms 
that  occurred  before  1945  were  deleted  due  to  concerns 
about  the  accuracy  of  the  observations.  Then  portions  of 
storm  tracks  outside  the  current  grid  of  interest  were 
deleted.  Finally,  the  subtropical  storms  (maximum  wind 
less  than  45  knots)  were  eliminated  due  to  their  weak 
persistence.  In  the  second  step,  the  lag  one  coefficient 
for  storm  tracks  in  the  grid  was  calculated  and  used  to 
determine  the  stationarity  category  of  the  storm.  Next, 
based  on  the  stationarity  category,  the  latitude  and 
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longitude  series  were  differenced  appropriately,  and  the 
lagged  data  matrices  were  constructed.  In  the  fourth 
step,  the  least  squares  regression  coefficients  were 
calculated.  Finally,  in  step  five,  the  coefficients  were 
used  to  forecast  the  storms.  The  results  of  these 
forecasts  were  compared  with  the  "best  track"  data.  This 
produced  the  empirical  forecast  errors  used  in  the 
examples  presented  later  in  this  chapter.  A  detailed 
discussion  of  the  data  files  and  the  required  data 
manipulation  is  in  Appendix  F. 

UNIVARIATE  ESTIMATION 

When  stationarity  is  achieved  for  a  series,  and 
the  order  of  the  model  is  tentatively  identified,  the 
model  parameters  must  be  estimated.  The  next  several 
sections  include  detailed  discussions  of  the  joint  density 
of  the  time  series  observations  and  derivation  of  the 
maximum  likelihood  point  and  interval  estimators  for  the 
univariate  process. 

Joint  Density  of  the  Observations 

Let  wt  =  <P1wt_1  +  $2wt-2  +  •**  +  ^pwt-p  +  at 

where 

at  %  N(0,oa2)  ,  wt  v  N ( 0 , °w2 ) 


at  represents  a  random  shock  (input)  to  the  system  at 


time  t,  wfc  is  the  observed  value  of  the  series  at  time  t. 
Let  wn'  =  ( w  ^  ,  Wj,  ...  ,  w  n ) .  Suppose  the  are 
independent  Normally  distributed  random  variables  with 
zero  mean  and  constant  variance.  Then  the  probability 
density  function  (pdf)  of  w^  is 

f(wi=w|$,aw)  =  ( 2  tt  ow2  )  _1  /  2  exp(-w2/2aw23 

Because  the  observations  are  independent  and  identically 
distributed  (iid),  the  joint  pdf  is  given  by  the  product 
of  the  marginal  pdf's 

n 

f  (Wj.  ,w2  ,  .  .  .  ,wn  |  <p  ,ow)  =  (2irow2)_n/2  exp(  (-1/2)  Z  (wi2/aw2)3 

i  =  l 

Let  l  represent  the  diagonal  n  x  n  covariance  matrix.  Then 


because  the  are  iid.  Then 

f  (w1,w2,...,wn]lj),aw)  =  (2TT)“n/,2|E  ]  "1,/2  exp(  (-1/2)  wn'  E-1wn3  (4.9) 

If  the  Wj  are  correlated,  the  functional  form  of  the  pdf 
is  the  same,  but  the  off-diagonal  elements  of  Z  are 
nonzero  (Morrison,  1976). 


Box  and  Jenkins  (1976)  and  Hannan  (1970)  use  the 


notation 

f(wn|4>,o  a)  =  (2iro  a  2 )  — n  /  2  j  Mnp  |  2exp(  (-1/2  °a2)  wn'MnPwn3  (4.10) 

where  MnP  =  wn  is  a  vector  of  n  observations, 

and  p  is  the  order  of  the  AR  model.  Substituting  for  MnP 
in  equation  (4.10)  yields  equation  (4.9).  Consequently, 
the  exact  likelihood  is 

f  (wn|^,aa)  =  L(<J>,oa|wn)  = 

=  (2Troa2)_n/2  |  MpP|  1/2exp{-(l/2aa2)wn’  MnP  wn3  (4.11) 

where  lMnP|=lMpPl'  because  the  wt  are  uncorrelated 
with  observations  more  than  p  time  periods  in  the  past. 
Thus,  all  the  covariances  beyond  row  and  column 
p  in  M  P  are  zero,  and  with  o  2  factored  out,  the 
variance  elements  (from  p+1  through  n)  are  one. 

Maximum  Likelihood  Estimators 

The  method  of  maximum  likelihood  can  be  used  to 
derive  estimators  for  the  parameters  and  the  variance  of 
the  noise  series.  Using  the  likelihood  function  (4.11), 
the  log  likelihood  function  is 

Jt(^'0alwn>  =  (_n/2)ln2^_(n/2)lrDa2+(1/2)lnfMpPl-1/(2ja2,wn'MnPwn  (4-12>- 
Let  S  (<j>)  =  w n '  Mnp  wn  , 


then  3  £  /3cr  a  =  (-n/ 2 )  ( 2a  Ja  a  *)  -  (1/2)  (-2)  (a  a  J)  S  (4, ) 

=  (-n/aa)  +  S(4>)/aa3 

equals  zero  at  the  maximum,  which  implies 

3a2  =  S{<j>)  /n  . 

The  derivation  of  the  estimator  for  the  vector  <j> 
is  somewhat  more  tedious  because  MpP  and  Mn^  are  functions 


For  example,  for  an  AR(1)  process  with 


wn‘ =Cw1 , . . . ,wn3 


from  (4.12)  £(<(>,  aa  |  wn)  = 


(-n/ 2) ln2  ir (n/2)  In  aa2+(l/2)ln(l -4  x2)- (l/2cr a2)  C  d"^2) wx2+  Z  (wt-$  1wfc_1) 2)  (4.13) 

v  t=2  j 


Expanding  a  in  (4.13)  yields. 


Cw12-4>12w12  +  w22-2<).w1w2+4.12w12  +  Z  (wt-«)1wt_1) 2) 


=  [w12+w22-2<)i1w2w1  +  Z  (wt-<j»1wt_i)  2 1 
n-1  n 

=  [-2<})1w1w2+  Z  l)!i2wt2  “  2(f’i  2  wtwfc  ,  ] 
t  =  2  t  =  2 

Then 

n-1  n 

34/a+i  =  (1/2)  [(-2^)/ (l-^2)]  -  (l/2oa2)  [-2wiw2+2<h  IWt2-  E^wH) 

t=2  t=2 

n-1  n 

=  [aw2  ( !-<#»  1 2)  (-  fjJ  /  (1-  <^2)  ] -f-2w1w2  +  2  ^  Z  wt2-  I  wtwt_1  ) 


t  U  wtWt“  1  r  1  u  w.j_  • 

t=2  t=2 


Setting  this  equal  to  zero  and  solving  for  <p  ^ ,  yields 


i  1  =  (2  wtwt-l  "  Y i ) / (  £  w.2) 

t  =  2  t  =  2 


(4.14) 


Clearly,  4.14  is  asymptotica  1  ly  equivalent  to  the  ordinary 

least  squares  estimate  of  given  by 

n  n 

$-,  =  (!  )  /  (  Z  wt2 ) 

1  t  =  2  t  =  2 

When  n  is  large  enough.  Box  and  Jenkins,  and  the 
commercially  available  software,  prefer  to  ignore 
Y i  and  simply  use  the  ordinary  least  squares  estimate. 

A  convenient  iterative  least  squares  algorithm  will  be 
presented  later  in  the  paper. 

In  the  general  AR(p)  model,  the  <p  's  can  be 
estimated  by  conventional  ordinary  least  squares  or  by 
solving  the  Yule-Walker  equations.  The  Yule-Walker  esti¬ 
mates  are  calculated  by  solving  the  system  of  equations 


rl  =  <Pl  +  *2rl  +  *•*  +  *prp-l 

r  2  =  *lrl  +  h  +  • • •  +  iprp_2 


>lrp_l  +  ^  2rp-2 


where  r^  =  ck/cQ  and 


n-k 

ck  =  (1/n)  Z  wtwt+k 


Derivation  of  the  variance  of  the  maximum  likeli 


hood  estimators  is  reviewed  in  this  section.  Then,  an 
iterative  procedure  that  can  be  used  to  calculate 
estimates  of  the  autoregressive  parameters  is  discussed, 
and  those  estimates  are  compared  to  least  squares 
estimates.  An  actual  hurricane  track  is  then  used  to 
illustrate  the  variances  of  parameter  estimates  computed 
using  both  the  least  squares  and  maximum  likelihood 
criteria  approaches. 

The  variance  of  the  parameter  estimates  for  the 
autoregressive  process  is 

V[$]  =  I  ^  ( <p ) 

where  I  (<}>),  the  information  matrix,  is 

I  [*]  =  E[U'U]  oa 


and  U  is  an  nxp  matrix  containing  n+p  observations  lagged 
up  to  p  periods  (Box  and  Jenkins,  1976).  In  least  squares 


estimation  of  ut^j  is  defined  as  ut  j  =  -3at/3<f>j  given 
$  -j  0  (Granger  and  Newbold,  1977).  Thus,  uf  •  represents 

J  f  L-  f  J 

the  change  of  the  error  of  the  estimate  of  the  t*-^  value 

in  the  time  series  with  respect  to  changes  in  the  value  of 

the  jth  parameter.  The  gradient  (u+.  •  )  can  be 

r  /  J 

approximated  using  a  first  order  Taylor  series  expansion 
(Fig.  4.1)  as 


Ut,j  s  _(at,o  '  at)/(*j,o  "  V 
=  (at  ‘  at,o)/(*j,o  'V  * 

Solving  for  at 

at  =  “<*j  “  *j,o)ut,j  +  at,o  (4*16> 

where  j  ^ 0  ,  and  afc are  initial  guesses  of  the  parameter 
and  error  values  respectively. 


J 


The  one  dimensional  case  is  shown  in  Fig.  4.1. 
In  general,  for  the  AR(p)  model,  solving  for  at  yields 


P 

at  s  at,o  “  ^  j  "  ^q)  ut-j  *  (4.17) 

This  expression  is  the  first  order  Taylor  series  expansion 
commonly  used  in  nonlinear  gradient  search  algorithms 
(Granger  and  Newbold,  19  77).  If  afc  is  a  function  of  <b  , 
then 

ut  =  -3 at/3^ 


The  minimum  error  with  respect  to  the  parameters  is 
determined  along  the  direction  of  the  negative  gradient. 
For  moderate  and  large  samples,  when  an  ellipsoidal 
quadratic  approximation  to  the  likelihood  function  is 
appropriate,  the  global  minimum  occurs  where  the  sum  of 
squares  in  the  exponent  of  the  likelihood  function  is 
minimized.  Solving  (4.17)  for  at/0  yields 

P 

at,o  s  ^  "  ^o*  ut-j  +  at 


in  the  form  of  a  linear  regression  model, 
are  estimated  as 


The  (*j-4jf0) 


* 


3  'O 


(U'U)-1  U'  aQ 


where 


unl  un2 


'  at  f ( 0 , aa  ) 


At  the  final  iteration  the  well  known  result  of  linear 
least  squares  theory  yields 

n 

V[£]  =  S2(U'U)_1  where  S2  =  (n-p)-1  E  at2  .  (4.; 


The  covariance  matrix  of  the  estimates  is  also 


given  by 


V[{]  s  I-1  (<f>)  =  [E(U'U]0a2}"1  =  (nrp)_1oa2  =  MpP/n 

(Box  and  Jenkins,  1  976).  E(U'U)-1  can  be  shown  to  be 

equal  to  (nrp)-1  •  For  an  AR(p)  model 


wt  =  ,l  ^  jwt- j  +  a1 


so  a^-  =  w^.  E  wt-i 

j  =  l 

therefore 


Then  E[U'U]  contains  all  cross  product  terms  and 


E[U'U] 


Yp-1 

Yp-2 


nFP 


.  .  .  y 


O 


when  n  is  moderate  or  large.  Thus,  the  approximate 
covariance  matrix  of  the  parameter  estimates  may  be 
obtained  from  a  transformation  of  the  covariance  matrix  of 
the  data. 

A  Univariate  Example 

Consider  the  following  velocities  (in  degrees  of 
latitude  per  six  hour  interval)  of  an  example  hurricane 
in  1960  ( .  8 , 1 .0 , 1 . 1 , .  7  , .  1 , -.6 ,  -  .3  , .  6  ,  .7)  .  A  short 

series  was  chosen  to  better  illustrate  the  differences 
between  estimation  procedures.  An  AR(1)  model  was 
appropriate  for  this  series. 

Let  w  t  =  C .8 , 1.0 , 1.1  ,.7, .1 ,-.6 ,-.3  ,.6 , .7 } 

Then  there  are  eight  pairs  of  observations  that  can  be 
used  to  calculate  They  are  ( .  8 , 1 . 0 )  ( 1 . 0 , 1 . 1 )  ( 1 . 1 , . 7) 

(.7,.l)  (.l,-.6)  (-.6, -.3)  (-.3, .6)  (.6, .7).  Let  [x]  denote 
the  series  of  first  elements  in  each  ordered  pair.  Let  [y) 
denote  the  series  of  second  elements  in  each  ordered  pair. 


The  sample  mean  of  (x3  is  .425  .  The  sample  mean  of  Cy) 
is  .4125  .  Adjusting  [x]  and  [y3  to  zero  mean  yields 
X'  =  (.5875, .6875, .2875, -.3125  , -1.0125, -.7125, .1875, .2875  3 
Y'  =  (.375  ,.575  ,.675  ,  .275  ,-  .325  ,-1.02  5  ,-.725 ,  .1 75  3 . 
The  maximum  likelihood  point  estimate  of  is 
n  n-1 

=[  l  w.Wx.,  /  (n-1 )  ]  /  [  I  w,2/(n-2)]  (4.19) 

t  =  2  t=2 

=  [1.6975/8]  /  [2.5744/7]  =  .5769  . 

The  least  squares  point  estimate  is 

9  9  2 

^  =  [  l  wtwt-l]/[  1  wt  1  =  1.6975/2. 7150=. 6252  .  (4.20) 

t  =  2  t  =  2 

Now  in  the  usual  least  squares  regression  estimator 

=  (X'X>_1X'Y  (4.21) 

where 

8  8 

X  X  =  I  Xm2  =  E  ( w  t~ . 4  2  5 ) ^  . 

m=l  t=l 

The  question  of  whether  to  use  the  sum  of  the  wfc2  from 
index  2  to  9,  or  1  to  8,  to  represent  X'X  is  academic, 
because  as  the  number  of  observations  increases,  the  dif¬ 
ference  becomes  negligible.  In  fact,  the  SCA  time  series 
computer  package  uses  the  Gauss-Marquardt  search  algorithm 
which  converges  in  three  iterations  to  the  least  squares 
estimate  .6252  (Liu  et  al.,  1983). 


The  estimated  standard  deviation  of  the  maximum 
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likelihood  estimate  of  ^  /  for  a  first  order  model  is 

f.272  =  .5769 

std  dev  [?x]  =  [MpP/9]  =  [  (l-^!2) / 9 ] 1 / 2  =< 

1.260  $1  =  .6252 


From  least  squares  the  estimate  is 


9  9 

std  dev  [f,]  =  [S/ (X'X)  ] 1/2  =  [[  e  at2/  ( 8-1)  j  /  [  I  wt2] 

t=2  t=2 

=  { [1.699/ (8-1) ]/ [4.16] }l/2  =  .242  . 


SCA  calculates  the  standard  deviation  as 

std  dev  =  f [1.699/(8)]/[4.16]}1/2  =  .226 

which  uses  the  biased  maximum  likelihood  estimator  for  S. 

The  smallest  hurricane  model  has  178  observations 
and  most  parameter  standard  deviations  are  approximately 
.05.  Even  with  that  small  standard  deviation,  the 
estimates  of  <t>  ^  are  not  statistically  significantly 
different.  Since  the  SCA  estimate  matches  the  result  of 
SPSS  to  four  significant  digits  (all  that  SCA  reports),  it 
seems  appropriate  to  use  the  least  squares  point  estimator 
in  equation  (4.21).  In  addition,  because  of  the  way  the 
hurricane  tracks  are  segmented  by  region,  there  is  missing 
data  which  SCA  can  not  handle.  Thus,  the  least  squares 
estimates  from  SPSS  are  used  in  all  regions. 


! 


Univariate  Forecastinc 


The  time  series  models  are  to  be  used  to 
forecast  hurricane  positions.  It  is  usually  a  simple 
matter  to  calculate  a  point  forecast.  The  more  difficult 
task,  especially  in  the  multivariate  case,  is  to  calculate 
approximate  probability  limits  surrounding  the  point 
forecast.  As  a  prelude  to  the  bivariate  section,  this 
section  contains  the  development  of  the  univariate  point 
and  interval  forecasts. 

Suppose  an  AR  ( 1 )  model  has  a  nonzero  mean.  Then 


wt~  V  =  4>i(wt_i-y)  +  at 

Wt  =  ^lWt-l  ~  $1^  +  v  +at 

wt  =  1*,lwt-l  +  +  at 


Note  that  -<j>1y+y  can  be  thought  of  as  Y’-s^X  =  bo  (the 
intercept)  in  a  simple  linear  regression. 

For  the  AR(2)  model. 


wt“U  =  <P  i  (wt  — 1~ y )  +  <p 2  ( )  +  at 
wt-y  =  ^lwt-l  -  +  ^2wt-2  “  ^2V  +  at 

wt  =  <Plwt-l  +  <J>2wt-2  +  at  +  y  ( 1— l- 4> 2 ) 

In  general  for  the  AR(p)  process, 

wt  =  ^lwt-l  +  **•  +  ^pwt_p  +  y  ( 1  <#>2  <#>2  - *p>  +  at 


..A 


-  %  A  -V  *-  A  A  A\Vi\ 


Consider  the  one  step  ahead  forecast  of  the  AR(1) 


process  with  nonzero  mean. 

wt  (1)  =  4>1(wt-p)  +  y  =  ^2.wt  “  ^1^  +  y 
Then  the  two  step  ahead  forecast  is 


wt(2)  =  «j>1  (wfc  (l)-y)  +  y 

=  <j>1C(j>1wt  -  ^1y}  +  y 
=  4>12(wt  -  y]  +  y 

and  the  £fc^  step  ahead  forecast  is 


wt(£)  =  ^  C  w^_  —  y  3  +  y 

For  stationarity  it  has  been  shown  that  1 ^  j  <  1. 
Therefore , 

lim  wt ( £ )  =  y 

£  -*•<*> 


and  the  forecast  converges  to  the  mean  of  the  series.  For 
hurricane  forecasting  this  means  that  the  greater  the  lead 
time,  the  less  information  one  has  concerning  the  storm. 
Consequently,  the  average  velocity  of  past  storms  is  the 
best  guess  for  a  predicted  path  when  the  lead  time  is 
large . 

The  forecast  error  for  the  AR(1)  one  step  ahead 


forecast  is  given  by 


=  (wt-p)  +M+at+1}-[4>1  (wt-p)  +  yl 
=  at+l  * 

Thus,  the  one  step  ahead  forecast  error  is  the  noise 
series.  The  error  for  the  two  step  ahead  forecast  is 


et  (2) 


wt+  2  ~  °t  +  2  =  Wt  +  2 
C*1  {wt  +  l'M)  +1J  +  at+2 
at  +  2  +  <^l  ^  ^wt+l_y  ^ 
at+2  +  4ll  (at  +  l)  * 


-  wt(2) 

3-C  1 2  (wfc-y )  +y  3 

l (wt-p)  ] 


Similarly,  the  error  in  the  step  ahead  forecast  ii 


e.  («,)  =  a 


t  +  A 


+  (f> 


1 (at+£ 


-l)+<j>l  (at+A-2) 


A-l. 


lt+l 


and,  because  the  afc  are  independent  and  identically 
distributed  (iid),  the  variance  of  the  step  ahead 

forecast  is  given  by 


V[etU)  l=V[at  +  )l]+4.12V[at  +  jl.1]^14V[at  +  jl.2]  +  -** 

+  h2£"2Vtat+13 

=  oa2  [ (l-*!2) ]  • 

For  the  AR{2)  model,  the  one  step  ahead  forecast  error  is 


etU)=  wt+1  -  wt  +  1 

=  t$l  (wt-y)  +(j>2  <wt_1-y  )  +  y  +  at  +  13-{4>1  (wt-y)  +(j>2  (wt-l“H)  +y 


-  I  (wt  +  ^-y  ) +(j>2  (wt~y  )  +  p  +  at  +  2  J 
-l$1(wt(l)-u)+  <t>2  (wt~  P)  +  U  3 

=  Ulwt  +  i  +  ^2wt+at  +  25'^l  1 4>  i  +  $  2wt-l  ^  +  ^  2wt  ■* 
=  *i(wt+1-(^1wt  +  *2wt-1) )  +  at+  2 
=  afc  +  2  +  4>1at+1 


These  results  are  based  on  the  assumption  that  the  <f>^  are 
constant  parameters  and  oa  is  known.  In  practice  these 
quantities  are  random  variables  and  must  be  estimated. 
Consequently,  the  confidence  regions  for  the  forecast 
error  are  approximate. 

The  calculation  of  et(£),  for  AR  models  is 
simply  an  algebraic  exercise.  However,  as  l  increases 
the  computation  becomes  laborious.  Indeed,  there  is  an 
easier  way  to  calculate  the  £th  step  ahead  forecast  error 
for  the  AR(p)  model.  It  involves  the  ij;  weight 
representation  of  the  AR  process  which  is  defined  by 
expressing  wt  in  terms  of  the  past  values  of  the  noise  at. 
The  psi-weight  form  of  the  AR(p)  model  is  derived  from  the 
difference  equation  form  of  the  AR(p)  model 


wt  =  -hwt-l  +  *2wt-2+’"+*pwt-p+at 


then 


(l-^B-^B2-’  •  --*pBP)wt  =  at 
wt  =  ( 1  —  4>1B— 4>2B2—  *  *_<f>pBP)  _1at 
=  (l-I^B-tflnB2-  •••  '*■ 
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-  ^ 1^ t— 1~ ^ 2^ t— 2~  " 


For  the  AR(1)  model,  the  weights  are  successive  powers 


of  <(>  ^ . 


wt  _  U  =  4>]_  (wt-i~v )  +at 
( 1  — <f>  1B )  (wt-p)  =  at 

wfc  -  p  =  (!  —  <#> 3_B)  “1at 


wt  =  p+at  +  ^1at_1  +  4’^/^at_2+ '  ' ’ 


’here  the  afc  are  the  residuals  of  the  one  step  ahead 
forecasts  of  the  previous  observations. 

For  the  AR(2)  model, 

wt  -  u  =  (1-<|,1B-<J>2B2)  _1at 

Expanding  (1- ^B-  <f>2B2) -1  by  a  Maclaurin  series  yields  the 
approximate  if,  weights  (Abraham  and  Ledolter,  1983). 


f  (B)  =  (l-<fr,B-*9B^) 


2,-1 


f ' (B)  = 


\ 1^2  y 2°  ' 

(^1  +  2(^2B)  { 1-<#»1B-4>2B2 ) -2 


f”(B)  =  2  [  (♦1  +  2  +  2B)  ^  (l-^B-^B2)  ~3  +  <p2  d-<)'1B-l)>2B'i)-z] 


,2,-2 


f 1  "  (B)  = 


6  [  (4>1  +  24>2B)  3  (1— 4>  x B-  ^2B2)-4 
+  (  ♦  x  +  24»  2b)  (1“4>  iB-<J>  2B2)- 3  ( 24,  2)  ] 


f ' (0)  = 

f  *  '  (0)  =  2  [  ( 4>x)  2  +  <#»23 

:  *  "  (0)  =  6  [  (4>1)  3  +  24>14>23 


«  .  *•  .  •*.  i‘  *  , * 
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wfc  -  V  =  +  +  (4>13  +  2<l>14>2)  B3  +  *  *  *  ]  at 

wt  =  vi+  [^0  +  4'2B  +  ^2B^  +  >l'3B^+  *  *  •  ]  at  .  (4.22) 

Equation  (4.22)  may  be  expanded  to  any  desired  degree  of 
accuracy  and  is  easily  obtained  for  the  low  order  AR 
models  with  which  one  is  usually  concerned.  For  example, 
consider  the  AR(4)  model  with  ^  and  ^  equal  to  zero. 

Then 

2  3  4 

’I* ~ 41  ]_  /  ^  2  =  ^  1  >  ^ 3  =  ^  1  ’  ^4  =  (^i  +  41 4 

and,  in  general. 


*j  =  <Mj-l  +  (Mj-4  • 

Next,  the  forecast  error  for  the  AR(p)  model  is 

j_  y. 

derived  in  terms  of  the  ij)  weights.  Let  the  J6  step 
ahead  forecast  be  given  by 

wfcU)  =  y  +  C0at  +  C1at_1  +  C2at_2+ • *  •  (4.23) 

where  the  E,  j  are  to  be  determined.  Then  the  forecast 
error  (known  only  after  time  t+£)  is 

( £ )  =  w^+  £~ Wt  ( £ )  =  (  v  +  ’I'o^ t  + 1, +  ^  la t  +  Z — 1 +  ^ 2a t  +  SL  —  2  +  *  *  *  ^ 

~(M+C0at  +^lat-l  +^2at-2  +'*') 

and  because  the  at  are  iid,  the  variance  of  the  step 


ahead  forecast  is  given  by 


E[etzU)]  =  (at+£+^lat+£-l+^2at+£-2+*  *  "+^je,-iat+l 


-  at+  +  at-l+ - 5  2 


=  (l  +  ^12  +  ^22+*  *  *+^e,-i2>aa2  +  z  (  ^  £  +  i -C  -; ) 

-  -  j=0  J  J 


2  2 
°a 


This  variance  is  minimized  when  +  j  =  E,  j  . 

Thus,  the  minimum  mean  square  error  forecast  from  (4.23) 
is 


wt(£)  _  P  +  ’jJjlat  +  ij>£  +  iat_1  +  ijjJ1  +  2at_2  + 


Therefore , 


etU) 


(n  +  at  +  i  +  *lat  +  t-l  +  *2at  +  *-2+",) 

-  (p  +  <jj£at  +  ^£  +  iat_i  +  ^£  +  2at-2+"  *  *  > 

(at  +  £  +  ^at  +  £-l+*  *  '+*£-lat  +  l  +  ,<;Jlat  +  ^  +  lat-l  + 

~  ^£at  +  ,*'£  +  lat-l+“’) 

at  +  £  +  ^lat  +  £  — 1+*  *  "+<^£-lat  +  l  ' 


*) 

(4.24) 


and  the  variance  of  the  i,*-*1  step  ahead  forecast  error  is 


V[et  U)  ] 


E£at  +  s,2J  +<('i2E[at+£-i2]  +  *'*  +  ,i,£-i2ECat  +  i2] 
aa2<1  +  'i*l2  +  '(j22+’  "+^£-l2>  •  (4.25) 


There  are  three  important  conclusions:  (1)  the  forecast 

error  is  a  linear  sum  of  independent  Normally  distributed 
noise  terms,  and  so  is  Normally  distributed  with  mean  zero 
and  variance  given  by  (4.25),  (2)  the  forecast  error  is  a 

function  of  unknown  future  shocks  (noise)  which  must  be 
estimated,  (Consequently,  forecast  errors  beyond  one 


period  are  correlated.),  (3)  this  approach  assumes  that 
the  (and  consequently  the  $  ■  )  are  constant  and  cr  2  is 
known.  In  practice  these  quantities  are  random  variables 
which  must  be  estimated.  Thus,  the  confidence  region 
surrounding  the  forecast  is  approximate.  Also,  if  the 
variances  of  the  <J>  ^  are  large  (as  in  the  univariate 
example)  the  confidence  interval  may  be  severely  under¬ 
stated  (Pankratz,  1983). 

In  order  to  reduce  the  variance  of  the  parameter 
estimates  it  was  necessary  to  determine  a  way  to  combine 
hurricane  tracks  so  as  to  maximize  the  number  of  observa¬ 
tions  in  each  grid.  This  was  accomplished  by  differencing 
the  individual  storm  tracks  (if  required)  and  then  using 
the  pairwise  deletion  option  of  SPSS.  For  example,  two 
tracks  were  being  combined,  each  having  ten  observations. 
Then  for  the  velocity  model  there  were  18  observations 
available  to  compute  the  lag  1  parameter  <j>^,  16  available 
to  compute  4>  2 /  14  available  to  compute  <£3  and  so  forth. 
This  procedure  increased  the  number  of  observations  in  a 
particular  region  and  so  decreased  the  variance  of  the 
parameter  estimates  (to  approximately  .0016  for  a  parame- 


Univariate  Interval  Forecast  Example 

In  this  example,  for  the  arbitrary  region  10-15 
degrees  north  latitude,  and  45-83  degrees  west  longitude 
confidence  intervals  associated  with  the  24,  48,  and  72 
hour  latitude  forecast  are  desired.  A  univariate  AR(4) 
model  of  the  applicable  latitude  position  series 
(differenced  once)  yields  the  model 

Velt  =  .69558Velt_1+  .10069Velt_4 

where  Velt  =  (LAt-LAt-1) ,  and  LAt  is  the  latitude  (in 
degrees)  at  time  t.  Consequently, 

LAt  =  1.69558LAt_1  -  .69558LAt_2  +  . 10069 (LAt_4-LAt_5 ) 

The  psi  weights  for  this  model  are  computed  as  shown  in 
the  previous  section.  Here  #  =1,  ^  =  1.69558, 

4>2  =  1.69558*1+(-.  69558)*0  =  2.17941 


<l»3  =  <j>- Llf)2 

+ 

<p2*Pi 

= 

2.51595 

'1*4  =  <P1'P  3 

+ 

<P2'I>2 

+ 

U+o 

=  2.85073 

♦5  =  ♦^♦4 

+ 

+ 

+  =  3 . 15364 

*6  =  *1+5 

+ 

♦2*  4 

+ 

U*2 

+  $5^1  =  3.41305 

4>7  =  <P  1*6 

+ 

^2^5 

+ 

+  <j>  cjip  2  =  3.62737 

Similarly, 

(j/8  =  3 . 81016  ,  i|/g  =  3 . 96781  ,  i)/^g  =  4 . 10358  ,  <^^  =  4.21960 


The  one  step  ahead  latitude  forecast  errors  have  mean  zero 
and  standard  deviation  9.1645  nautical  miles  (n  mi). 

Thus,  the  variance  of  the  24  hour  forecast  is 


V[efc  (4)  ]  =  (9.1645)  2  (l  +  4,12  +  4,22  +  ^32)  =  84.0(14.9548)=  1256.2 

and  the  standard  deviation  (SD)  of  et(4)  is  35.4  n  mi. 
This  compares  favorably  with  the  empirical  standard 
deviation  of  35.9  n  mi  obtained  from  the  forecast  models 
in  Appendix  A.  For  the  48  and  72  hour  forecast 

SD [et (8)]  =  69.7  n  mi 
SD[et(12)]=  101.5  n  mi 

compared  to  the  empirical  values  of  67.0  and  82.2  n  mi 
respectively.  Consequently,  the  90%  confidence  intervals 
are  the  point  forecasts  ±  58.2,  114.6,  and  167.0  nautical 
miles  respectively. 


BIVARIATE  ESTIMATION 


Let  CZ^t3,  fz2,t-*  two  time  series,  each 
n  observations.  If  the  series  are  autoregressive  of 
p  then 

Zl,t  =  +ll,lzl,t-l  +  *ll,2Zl,t-2  +  *“  +  *11  ,pZl  ,t-p 
+  *12,lZ2,t-l  +  *'*  +  *12,pZ2,t-p  +  C1  +  al,t 


with 

order 
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and 

Z2,t  =  *21,lzl,t-l  +  +  *21,pzl,t-p 

+  *22 , 1Z2 , t-1  +  +  *22 , pZ2 , t-p  +  c2  +  al,t 


or,  in  matrix  notation. 


— 

— 

z  1  f  t 

= 

*11,1 

*12,1 

Zl,t-1 

Z2,t 

*21,1 

*22,1 

Z2 , t-1 

*11, P 

f 12  /P 

Zl,t-p 

+ 

C1 

+ 

*21, p 

*22, p 

Z2,t-p 

C2 

a2,t 

or. 


C  +  ah 

'V, 


(4.26) 


where  the  [af]  are  iid  N(0,2  ).  The  a*,  noise  series 

'X/  'Vy 

are  assumed  to  have  zero  covariance  except  at  time  t. 
That  is,  for  two  noise  series,  say  [a1)  and  [a2l, 


E  [a 


lft 


,a2,s1 


0  t  *  s 

I  t  =  s  , 


C  is  a  2  x  1  vector  of  constants,  and 


<)>(B)  =  I  -  <f> -  •••  -  <f>pBP 


where  ^  is  a  2  x  2  matrix  of  coefficients  at  lag  i  as 
shown  in  (4.26)  above.  The  individual  series  are  assumed 
to  be  weakly  stationary. 

The  model  in  (4.26)  can  be  expressed  as  a  multi- 
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variate  linear  model  and  £  can  then  be  obtained  via  multi¬ 
variate  least  squares  (Tiao  and  Tsay,  1983).  Specifical¬ 
ly,  let 


zVi 

1  z'p  z\ 

•  •  • 

C* 

♦  i  ' 

1 

a’P+i 

• 

Y  = 

1 

[S3 

3  •  • 

L 

,  x= 

•  •  • 

i  ^ ' n-1 *  *  *  ^ ' n-p 

,  n  = 

\  a. 

>  •  •  -O- 

_  J 

r  C  = 

a;' 

Then  Y=Xn+e  is  a  multivariate  linear  model.  Least  squares 
estimates  can  be  obtained  directly  (Hillmer  and  Tiao, 
1979).  Although  they  are  biased  (Granger  and  Newbold, 
1977) ,  the  least  squares  parameter  estimates  converge  in 
probability  to  the  true  parameter  values  (Hannan,  1970). 

Bivariate  Point  Forecast 

Similar  to  univariate  models,  bivariate  forecasts 
are  based  on  the  difference-equation  form  of  the  model. 
Specifically,  the  general  form  of  the  AR(5)  position  model 
used  in  all  regions  is 


LAt~*ll,lLAt-l  +  <*’ll,2LAt-2  +  (*lll,3LAt-3  +  <*’ll,4LAt~4  +  ‘*,ll,5LAt-5 
+  ^12,lLOt-l  +  ^12,2LOt-2  +  <*'l2,3LOt-3  +  <i,12,4LOt~4  +  ‘*'l2,5LOt-5 


where,  LAt  represents  the  latitude  at  time  t,  LOt 
represents  the  longitude  at  time  t,  and  and  C2  are 
scalar  constants. 

For  the  region  25-30  degrees  north  latitude, 
45-100  degrees  west  longitude  (the  Gulf  of  Mexico  coastal 
region),  the  model  is 

LAt=l .777LAt_1-.777LAt_2+ .0l6LAt_4- ,016LAt_5 

-.101LOt_1+.101LOt_2+.083LOt_4-.083LOt_5+.071 

LOt=.103LAt_1-.103LAt_2-.198LAt_4+.198LAt_5 

+1.837LOt_1-.837LOt_2+.032LOt_4-.032LOt_5+.052  . 

Parameters  not  significantly  different  from  zero  have  been 
included  for  ease  of  programming,  and  model  comparison. 

The  one  step  ahead  forecast  position  (LAt,LOt)  is  based 
on  the  use  of  the  five  previous  (six  hour)  position 
reports.  To  obtain  the  two  step  ahead  forecast,  (LAt,LOfc) 
is  treated  as  the  last  observed  position,  and  the  one  step 
ahead  forecast  (from  time  t)  is  computed.  Forecasts  for 
lead  times  up  to  n  steps  ahead  are  computed  in  a  similar 


manner . 


Point  Forecast  Example 

Consider  the  following  five  position  reports  from 
hurricane  Elena  (1985). 


(LAt_5,LOt_5)  =  (28.7,  85.2) 

(LAt_4,LOt_4)  =  (28.8,  84.4) 

(LAt_3 ,L0t_3)  =  (28.9,  83.9) 

(LAt_2 ,L0t_2)  =  (28.8,  83.8) 

(LAt_1,LOt_1)  =  (28.5,  84.0) 

The  last  position  report  (time  t-1)  was  for  0500  Central 
Daylight  Time  (CDT)  on  September  1.  Actual  landfall  of 
Elena  was  54  hours  later  at  Gulfport,  Mississippi  (30.3, 
88.8).  The  seguence  of  position  forecasts  from  the  25-30N 
latitude  model  is  (28.3,84.1),  (28.1,84.3),  (28.0,84.4), 

(28.0,84.6),  (28.1,84.9),  (28.2,85.3),  (28.3,85.6), 

(28.4,86.0),  (28.6,86.4).  These  forecasts  have  an 
important  characteristic.  Even  though  the  initial  motion 
vector  points  to  the  southwest,  the  model  "turns"  the 
storm  to  the  northwest.  That  is,  given  that  the  hurricane 
is  "wandering"  (as  Elena  was  at  this  point)  the  forecast 
of  future  motion  is  the  average  velocity  vector  of  past 
storms  which  points  to  the  northwest.  The  landfall 
forecast  error  of  the  time  series  model  is  approximately 
150  n  mi. 

This  is  more  accurate  than  other  forecasts  of 
this  storm.  Model  A  (Freeze,  1983),  which  depends 
heavily  on  the  initial  motion  vector,  forecasts  landfall 


67 


in  72  hours  on  the  Yucatan  peninsula.  The  error  of  model 
A  is  approximately  480  n  mi.  At  the  time  of  this  position 
report  the  National  Hurricane  Center  had  hurricane 
warnings  posted  from  Pensacola,  Florida  to  Fort  Meyers, 
Florida  and  was  forecasting  landfall  near  Cedar  Key, 
Florida.  This  was  an  error  of  approximately  250  n  mi. 
Thus,  it  appears  the  time  series  approach  is  potentially 
effective . 

Interval  Forecast  of  One  Variable 

If  the  random  shocks  Calt),  (a2t)  are  Normally 
distributed  (as  assumed),  and  the  appropriate  model  has 
been  estimated  with  a  sufficiently  large  sample,  then  the 
forecasts  are  Normally  distributed  (Pankratz,  1983). 
Consequently,  appropriate  90%  confidence  intervals  for  the 
step  ahead  latitude  or  longitude  forecast  are 

LAt  U)  ±  1.645  SD[eltU)  ] 

LOt  (A)  ±  1.645  SD[e2t  U)  ] 

where  e^t(£)  and  e2|-(A)  are  the  A*'11  step  ahead  forecast 
errors  for  latitude  and  longitude  respectively. 

In  the  bivariate  case,  the  psi  weights  necessary 
for  computation  of  e^t(£)  and  e2t^)  are  computed  using 
the  general  model 


•  *>  ■ 


LAt=*ll,lLAt-l+*ll,2LAt-2+*ll,3LAt-3+*ll,4LAt-4+*ll,5LAt-5 
+  *12,lLOt-l  +  *12,2LOt-2  +  *12,3LOt-3't'*12,4LOt-4  +  *12,5LOt-5 
+  C1  +  alt 

LOt=<^21 ,  lBAt-l+*21 , 2BAt-2+*21 , 3BAt-3+*21 , 4BAt-4+*21 , 5LAt-5 

+  *22,lLOt-l  +  *22,2LOt-2  +  *22,3LOt-3  +  <*'22,4LOt-4  +  <,,22,5LOt-5 
+  c2  +  a2t  , 

where  a^t  and  a2t  represent  the  noise  in  the  forecast  for 
period  t  for  latitude  and  longitude  respectively.  Solving 
for  LAt  and  L0t  in  terms  of  the  <j>  (B)  polynomials, 
and  eliminating  the  constants  (which  do  not  contribute  to 
the  error),  yields 

LAt  =  (|»i2/iB-|-<i>i2,2B2  +  <*>i2,3B3't'(?,12,4B4't'lt>12,5B5)LOt+alt  (4-27) 
d“*il,iB“*n,  2B^~*11 , 3b3-^11 , 4b4-(^11 ,  5B^ 

L°t=  ^21,1B+i^21,2b2^^21,3b3  +  ^21,4b4  +  ^21,5b5^  BAt  +  a2t  *4'28) 
d“*22 , 1B~*  22 ,2b2~^22,3b3~^22,4b4-i^22,5b5^ 

Solving  these  two  simultaneous  linear  equations  for  LAt 
and  L0t  in  terms  of  a^t  and  a2t  results  in 

LAt=f  ( *12 , 1B+  ’  *  ’+(J,12,5B5)a2t+  ^ 1-<^  22 , 1B~  *22 , 5b5  )  alt 3  / 

£  d-^H,!8  -*H,5b5J  d-*2  2,lB - *22,5b5)_ 

(*12,1b+"'+*12,5b5)  (*21,1b+'  ‘  *+*21,5b5)  3  (4.29) 

L0t=[  ( «#> 21 , 1B+  '  ’  *  +  *21,5B5>alt+  d-<f»n,iB  *ll,5B5)a2t3/ 

£  (1-*11,1B - -*U,5b5)  d-*22,lB - -*22,5b5)_ 

(*12,1B+’ * *+*12,5b5) (*21,ib+* * *+*21,5b5) 3  *  (4-30) 


where  the  ratios  of  the  above  polynomials  yield  the  psi 


weights.  The  error  of  the  &th  step  ahead  forecast  from 
time  origin  t  is  given  by 

elt  (A)  =ait  +  ^n,iait+Ji-l  +  ^ll  ,2alt+£-2+"  ‘+'f'll,Jl-lalt+l 
+^12,la2t+£-l+*  “+^12,£-la2t+l 

e2t (  =a2t  +  li'22,la2t  +  £-l  +  ^22,2a2t  +  £-2  +  *  *  * +^22 , £-la2t+l 
+^21,lalt+£-l+‘ * *+^21/£-lalt+l  • 

The  cov(alt,a2s)  =  0,  t  *  s,  so 

E[elt2(£)]  =  U  +  4-llfl2  +  K»llf22+'**+^llf£-12)Oal2 
+  {^12,12+'  *  *+'*,12,£-l2)aa22 

+  2  (*11, 1*12, 1+  ‘  *  '+’l'll,£-l’,'l2,£-l)E[ala2J  (4-31) 

E[e2t2(^)]  =  d  +  ,),22f  12  +  I,'22,22+’ ‘  *+^22,£-l2)oa22 
+  ^21,l2+***+^21,£-l2)aal2 

+  2 (^22, 1^21, 1+*  * *+^22,£-1^21,£-l)Efala2J  (4*32) 

E[elt  (£)e2t  (£)  ]  =[  (1  +  <(»11  ,  1*22,1+  ”  ’  +  *11 ,  £-1^22  ,  £-l) 

+  (^12 , 1^21 , 1+ " * *+^12,£-1^21,£-l^ 3  E [aia2 ] 

+  <*11,1*21,1+*  *  *  +  ^ll,£-1^21,£-l30al2 
+  ^12 , 1^22 , 1+  *  ’  '+,<'12,£-1^22,£-I)aa22*  (4*33) 


For  the  Gulf  coast  region  discussed  in  point 
forecasting,  the  covariance  matrix  (E )  of  the  one  step 
ahead  forecast  forecasts  is 


212.576  -28.936 
-28.936  307.301 

By  (4.31)  the  variance  of  the  eight  step  ahead  forecast 
for  latitude  is 

V[elt (8)  ]  =  (l  +  1.7772  +  2.3702  +  2.8152  +  3.1552  +  3.43  92  +  3.6942 
+  3 . 9322)  (213.16) 

+  (0+ (-.101)  2+ (-.264)  2+ (-.460)  2+ (-.587)  2  +  (-.665) 2 
+  (-.715) 2+  (-.752) 2}  (307.301) 

+  2  (-11.837)  (-28.936)  =  15927.0  . 


The  corresponding  standard  deviation  of  126.2  n  mi 
compares  favorably  with  the  empirical  value  of  137.2  n  mi. 


Interval  Forecast  of  Two  Variables 

Results  of  chi-sguare  tests  of  latitude  and 
longitude  fo,  ^cast  errors  for  category  five  storms  do  not 
reject  the  hypothesis  that  the  joint  distribution  of  the 
forecasts  is  bivariate  Normal  (Appendix  D) .  Specifically, 


the  distribution  is 


f(LA,LO)={  (2^2)  (1-P2)  1/23  1  exp  (-1/2)  (1/ [l-p2])(  (  [LA-p^ /o^  2 

-2p  ([LA-u1]/a  x)  ([LO-w2]/o  2)  +  (LO-p2]/o  2)2}  (4.34) 

where  o  ^  and  a2  represent  the  standard  deviations  of  the 
latitude  and  longitude  forecasts  as  computed  from  the  psi 
weights  and  (p  ^  p2)  is  the  point  forecast.  The  exponent 

of  (4.34)  can  be  used  to  specify  the  equation  of  a 
confidence  ellipse  in  two  dimensions  when  it  is  set  equal 
to  some  positive  constant  (say  c)  (Morrison,  1976). 

Let  (Xi ,x2)  be  any  point  on  the  ellipse.  Then 
(Xi-pi,x2-p2)  is  the  vector  from  (xlfx2)  to  the  center 
(Hf, p2).  The  length  of  the  vector  is  maximized  when 
(x-p)'(x-p)  =  f  (x^-p-^)  (x2~p2)  is  a  maximum,  subject  to 

(1-P2)  _1 1  (x1-p1)  /o^2- 2p  [  (x1— y )  /  a-jJ  [  (x2~p2)  /  a2J 

+ [ (x2~p2) /a2]  =  c 

This  optimization  problem  can  be  solved  by  use  of  the 
Lagrangian  L  where 

L=(x1-p1)2+(x2-p2)2-X(l-p2)-1[[(x1-p1)/o1]2 

-2P  ( (xj^-M!)  /  [  (x2- M2)  /  a2^  +  f  <x2“  ^2^  a2^  2  -c  J 

Taking  partial  derivatives  to  find  a  stationary  point. 


3L/9xi =2 (Xi -p I )  “2 A (l-p2)-^! ( x 1  —  p 1 )/ a  1 2 1 


+  2X  [  (  p/  1- p2)  ]  [  (x2-M  2)  /  (°  i°  2)  ^  =  0  ' 


3L/3X2=2(x2-U2)“2x(l~P)f<x2“p2)/02j 

+  2A[(p/l-p2)]  [  (xi~\i  1)/(o1o2)]  =  0 


or,  equivalently 

I (X-p) -Al-1 (X-p)  =  0 

( I- A Z-1 )  (X-p)  =  0  .  (4.35) 

Pre-multiplication  by  I  yields 

{  E- A I )  (X-p)  =  0 

Thus,  the  coordinates  specifying  the  principle  axis  of  the 
ellipse  lie  on  the  eigenvectors  of  £.  Pre-mul tiplying 
(4.35)  by  (X-p)'  yields 

(X-p)' (X-p)  *  A (X-p) 'E-1 (X-p) 

=  Ac 

Then  the  length  of  the  principal  axis  is  maximized  when 
A  is  the  largest  eigenvalue  of  E.  The  length  of 
the  axis  is  2(A1)1/2c,  where  X1  is  the  largest  eigenvalue 
of  Z,  and  c  is  the  number  of  standard  deviations  for  the 
chosen  confidence  interval.  The  covariance  matrix  I  is 
the  matrix  for  the  l th  step  ahead  forecast,  and  is 
computed  from  equations  (4.31),  (4.32),  and  (4.33). 

Bivariate  Interval  Forecast  Example 

In  this  example  the  90%  confidence  ellipse  is 
derived  for  the  48  hour  forecast  for  the  Gulf  coast 
region  (25-30N,  45-100W).  From  the  previous  example,  the 


standard  deviation  of  the  latitude  forecast  is  126.2  n  mi. 
For  longitude  the  eight  step  ahead  forecast  is  given  by 
(4.30)  where 

LOt=[ ( .103B-.103B2-.198B4+.198B5)alt 

+ (l-1.777B+.777B2-.016B4+.016B5)a2t3/ 

1(1-1 . 837B+ . 837B2-. 032B4+. 032B5) 

(1-1.777B+.777B2-.016B4+.016B5) 

- (-.101B+.101B2+.083B4-.084B5) 

(.103B-.103B2-.198B4+.198B5) ) 

=  C (.103B-.103B2-.19  8B4+.19  8B5)alt 

+ (l-1.777B+.777B2-.O16B4+.016B5)a2t)/ 
{1+3.614B+4.888B2-2.935B3+.592B4+.105B5-.067B6 
+.009B7+.017B8+.032B9+.017B10}  . 

Then  the  estimated  variance  of  the  eight  step  ahead 
longitude  forecast  is 

V[e2t (8) ] = (0+.1032+.2692+.4692+.4842+.3852+.2192+.0162) aal 
+ (12+1.83 72+ 2. 5282+ 3.0902+3. 5722+4.0092+4.4212 

+4.8172) (oa2) 2 

+  2  (6.593)  (-28.936) 

=  .728  (212.576) +91.893  (307. 301) -381. 55 
=28012.154  . 

Then  the  estimated  standard  deviation  of  the  eight  step 
ahead  longitude  forecast  is 


The  estimated  covariance  of  the  eight  step  ahead  longitude 
and  latitude  forecast  is  computed  as 


E[e, t(8)  e2t (8)] =[1+1. 777(1. 837) +2. 3 71 (2. 528) +  2. 815 (3. 090) 

+  3-155 (3. 521) +3. 439 (4. 009) +3. 694  (4. 421) 

+  3.932  (4.817)  +  (-.101)  (  .  103)  +  (- .  264)  (.269) 

+  (-.460)  ( .469)  +  (-.587)  ( . 4 85 )  +  (- . 667 )  (.382) 
+  (-.715)  ( .213)  +  (-.  752)  ( .  016) ]  (-28.936) 

+  Cl. 777 (.103) +2.370  (.269) +2.8 14  ( .469)  + 

+  3. 155 (.484) +3. 439  (.382) +3. 694  (.213) 

+  4.144  (-.191)  3  (212.576) 

+[ (-.101) (1.837) +(-.264) (2.528) 

+  (-.469)  (3.090)  +  (-.586)  (3.571) 

+  (-.669)  (4.009)  +  (-.715)  (4.420) 

+  (-.  752)  (4.817)  3  (307.301) 

=  -5498.78 


where  -28.936,  212.576,  and  307.301  are  elements  of  the 
empirical  one  step  ahead  forecast  covariance  matrix 
evaluated  previously.  Then  the  estimated  correlation  of 
the  eight  step  ahead  forecast  is 

p  =  -5498.78/ [ (167.4)  (126.2) ] 

=  -.260 

where  167.4  and  126.2  are  the  estimated  standard 
deviations  for  longitude  and  latitude. 

The  estimated  covariance  matrix  is 


l  = 


(167.4) 


(-.  260)  (167.4)  (126.2) 


(-.260) (167.4) (126.2) 


(126.2) 2 


r 


28022.80  -5492.72 


n 


The  estimated  eigenvalues  (Aj_,A2^  are  solutions  of  the 

quadratic  equation 

(28022. 80-A)  ( 15926 . 44  A ) -5492 . 722  =  0  . 

Then , 

A2  -  43949. 24A  +  416133470  =  0  , 

which  has  roots 

Ai  =  13804.52 
A2  =  30144.72  . 

The  corresponding  eigenvectors  are  <.360,.938>,  and 
<-.938,.360>.  The  second  eigenvector  orients  the  major 
axis  of  the  confidence  ellipse  and  has  length 

2  (30144.72) 1/2  (1.645) =571.2  n  mi  . 

The  minor  axis  has  length  386.6  n  mi. ,  lies  on  a 
magnetic  heading  of  tan-1  (.360/. 938)  =  21.0  degrees,  and 
is  orthogonal  to  the  major  axis.  The  ellipse  is  centered 
at  the  point  forecast. 

In  general,  the  equation  of  the  90%  confidence 
ellipse  for  any  forecast  is  given  by 

(x1/a1)2-2p(x1/a1)(x2/a2)+(x2/a2)2=1.645(l-p2) 

where  p  is  the  correlation  between  longitude  and  latitude 
forecasts  and  a-^  and  a2  are  the  standard  deviations  of  the 
longitude  and  latitude  forecasts  respectively. 
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Thus,  for  the  48  hour  forecast  for  the  Gulf  coast 


region,  the  equation  of  the  90%  confidence  ellipse  is 

(x1/167.4) 2-2 (-.260) (Xj/167.4) (x2 / 12 6 . 2 ) - (x2 / 126 . 2 ) 2 
=  1.645 (1- (-.260) 2) 
or 

(x12/28802.80) + ( .520x1x2/21125.88) + (x22/15926. 44) =1.534  . 

It  should  be  noted  that  this  confidence  ellipse 
is  based  on  the  one  step  ahead  forecast  for  all  storms 
that  have  passed  through  the  region,  so  it  is  fairly 
robust.  However,  locations  for  storms  that  exhibit  rapid 
changes  in  velocity  may  fall  outside  the  confidence 
ellipse  because  the  model  coefficients  are  based  on 
velocity-stationary  storms. 

Threshold  Point  Forecast  and  Example 

Forecast  models  have  been  developed  for  each  5 
degree  band  of  latitude  from  10  through  45  degrees  north 
latitude  (Appendix  A).  When  the  forecast  latitude  LA^ 
enters  a  new  latitude  band,  the  model  associated  with  that 
latitude  band  should  be  used. 

For  example,  position  reports  from  hurricane 
Henri  (September,  1985)  were  (21.0,64.0)  (21.4,65.0) 

(22.2,66.7)  (22.3,67.8)  (23.0,68.7).  Using  the  model  for 
20-25  degrees  north  latitude  yielded  the  following 
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forecasts:  (23.7,69.6)  (24.4,70.5)  (25.1,71.5)  .  At  the 
third  forecast,  Henri  was  predicted  to  enter  a  new  region, 
and  the  model  parameters  were  changed.  The  four  step 
ahead  forecast  was  made  using  the  new  parameters  for  25-30 
north  latitude.  That  forecast  was  (25.6,72.3). 

Threshold  Interval  Forecast  of  One  Variable 

The  time  series  model  developed  in  the  next  two 
sections  of  the  paper  is  similar  to  the  TARSO  model  (Tong 
and  Lim,  1980)  discussed  in  Chapter  Three.  There  are  two 
major  differences  between  the  TARSO  model  and  the  hurri¬ 
cane  model:  (1)  all  observations  on  which  the  forecast  is 
based  are  not  required  to  lie  in  the  same  region;  (2)  the 
noise  series  of  the  hurricane  model  are  allowed  to  co-vary 
at  the  same  lag,  while  Tong  and  Lim  assume  independence. 

In  addition,  Tong  and  Lim  do  not  discuss  the  distribution 
of  the  forecast. 

Definitions 

t-(k+l)  is  the  time  index  of  the  last  observed  position 
ti|>i}  i  =  0,...,k  is  the  set  of  old  grid  (Rj)  psi  weights 
j  is  the  index  of  the  region 

(£^3  i  =  0 , . . .  ,«.-k-l  is  the  set  of  new  grid  (Rj+1)  psi 
weights 

k  is  the  number  of  forecasts  in  Rj 

l  is  the  number  of  steps  ahead  from  the  last  observation 


et- (k+11  <£'k)  is  the  ^tn 
time  origin  t-(k+l) 


step  ahead  forecast  error  from 


at  *»,  N(0,oa2) 
nt  -v  N  (0  ,an2) 


LAt+j  is  observed  latitude  at  time  t+3 

LAt_1(«.)  is  the  £*"^  step  ahead  forecast;  time  origin  t-1 

LA.  is  the  the  first  forecast  value  in  R-,., 
u  -  2 + 1 

p  is  the  order  of  the  autoregressive  model 
C  4>  -l  3  is  the  set  of  autoregressive  parameters  for  Rj 
[$^3  is  the  set  of  autoregressive  parameters  for  Rj+1 

Consider,  for  example,  the  AR(1)  process  where  the 
true  model  is 

LAt  =  4'1LAt_1  +  at 

From  time  origin  t-£  ,  the  £t^1  step  ahead  forecast  for  LAt 
is 

LAt_£{£)  =  LAt_£(£-l)  . 

That  is,  the  £^h  forecast  depends  on  the  £-lst  forecast. 
Similarly, 

LAt_£U-i)  =  <J>  x  LAt_£  (  £-  2 ) 


ta. }  is  the  noise  series  from  R. 

L  j 

£ P 1 3  is  the  noise  series  from  Rj  +  1 


cov(a^.,nc.)  = 


0  t  *  s 

p0aan  =  s 
p  is  a  correlation  coefficient 


LAt-r£(1)  =  $1  LAt-£ 


where  LAt_£  is  the  last  observed  latitude.  The  the 
step  ahead  forecast  may  be  represented  in  terms  of  the 
last  observed  latitude  as 

tA  t_  £  (£)  =  <p  ^  ^  LA.(-_£  * 

Let  £=k+l.  Then 

LAt-£  =  LAt-(k+l) 

the  time  origin  in  Figure  4.2.  For  the  true  model, 
looking  forward  from  t-(k  +  l)  (i.e.  t-£ ) 

LAt-£+l  =  <f>iLAt-i,  +  at-JUl 
LAt-Jl  +  2  =  ^  lLAt-£,-l  +  at-£  +  2 

=  <P1  (<P]_LAt_z  +  <f,1at_£  +  jL)  +  at-£  +  2 

^  at-Z  +  3 

=  ^  1  ^  1  ^  1  ^  1.  ^  1  ^  t— £  + 1  +  2  +  ^t^Jc  +  3 

LAt  =  *l*LAt-A  +  * 1*  la t—  ( Jl- 1 )  +  '*'  +  +  at 

Then  the  fcth  step  ahead  forecast  error  is 

LAt  -  LAt_£  { % )  =  4>i*-1at_  (  j^-d  +  +  ♦xat-i  +  at 

=  ^£-lat-  ( £.-1 )  •*■•••  +  ^lat-l  +  ^oat  • 

The  order  of  the  AR  model  does  not  appear  in  the 
expression  of  the  error  in  terms  of  the  psi  weights. 
Indeed,  as  was  shown  in  univariate  forecasting,  the  psi 
weights  are  obtained  by  inverting  the  <j>  (B)  polynomial 


81 


which  generates  an  infinite  series  for  every  order  of  AR 
model.  Thus,  if  the  forecast  errors  are  represented  in 
terms  of  the  psi  weights,  the  analysis  of  the  forecast 
error  applies  to  all  orders  of  AR  models,  so  it  is 
functionally  sufficient  to  consider  the  AR(1)  process  in 
developing  the  approach. 

Referring  to  Figure  4.2,  consider  the  AR(1) 
forecast  of  LAt+1-  The  forecast  is  made  using  a  "new"  set 
of  parameters  from  Rj  +  ^.  From  time  origin  t,  for  the  true 
model 

LAt+l  =  *lLAt  +  nt+l  ‘ 

The  one  step  ahead  forecast  of  LAt+1  from  time  origin  t  is 

LAt(i)  =  *jLAt  . 

For  the  time  origin  t-1  the  true  model  is 

LAt  =  ♦1LAt_1  +  afc 

and  the  one  step  ahead  forecast  from  time  origin  t-1  is 

LAt_i ( 1 )  =  ♦iLAt.j 

and  the  forecast  error  is 

et_! (1,0)  =  LAt  -  LAt_1(l)  =  at  . 

For  two  steps  ahead, 

LAt+l  =  $lLAt  +  ^t+l 

-  ^l^lLA^-  —  ^  +  +  ^t  +  1 

LAt-l(2)  =  *iLAt_1<l) 

=  $14'1LAt_1 


et_!  ( 2  f  0 ) 


$lat  +  ^t+l  * 
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Similarly,  for  three  steps  ahead  (time  origin  still  t-1) 

LAt+2  =  $lLAt+l  +  ^t+2 

=  *1*1*1LAt+1  +  *i*!at  +  Vt  +  i  +  ^  t+  2 
LAt_1(3)  =  *1LAt_1(2) 

=  *l*l*lLAt-l 

et-l*3'0*  =  $l$lat  +  $lnt+l  +  nt+2 
In  general,  for  l  steps  ahead, 

et-l(£'0)  =  ?£-lat  +  ?£-2nt+l  +  *"  +  ?ont-l  +  £ 

For  time  origin  t-2,  and  for  one  step  ahead 

LAt-l  =  ^ lLAt-2  +  at-l 
LAj._2  (1)  =  <^>i^jAt-2 
et-2  tD  ~  i  * 

For  two  steps  ahead, 

LAt  =  ^  lXjAt_  i  +  at 

=  4>1<i>1LAt_2  +  'J’lat-l  +  at 
LA^_2(2)  =  <J>^LA^-^_2  (1) 

= 

et-2<2'1)  =  ♦lat-l  +  at 

For  three  steps  ahead  the  new  model  must  be  used.  Then 


LAt+ 1  =  $lLAt  +  nt+1 

=  4’i4li<<’]_LAt_2  +  +  $i  +  nt+l 

LAt_2(3)  =  $XLAt-2 ( 2 ) 

=  $iMlLAt-2 


et-2  ( 3 ' 1 )  -  $l$lat-l  +  $lat  +  ^t  +  l 


And  finally,  for  four  steps  ahead, 

LAt  +  2  =  $lLAt+l  +  r't+2 

s*l*l*l*lLAt-2  +  +  *i$lat  +  Vt+1  +  nt+2 

LAt_2(4)  =  $1LAt_2(3) 

=  *1*1<fr1*1LAt_2 

e£_2(4,l)  =  +  ^  i  ^  1 a  t  +  ^l^t  +  l  +  r't  +  2 

Three  important  general  conclusions  are  evident. 

(1)  The  5  weight  coefficients  of  the  a's  are  the  same 
within  a  particular  forecast. 

(2)  The  ip  weight  coefficients  of  the  a's  are  the  same  for 
all  steps  ahead.  In  other  words,  the  coefficients  of  the 
a's  depend  on  k,  the  number  of  forecasts  that  fall  in  Rj. 

(3)  The  C  weight  coefficients  of  the  n 's  are  the  usual 
psi  weights  for  a  "within"  grid  forecast. 

Thus,  by  a  two  term  induction,  (£  and  k) ,  if  the 
forecast  origin  is  t-(k+l)  and  the  first  forecast  in  Rj  +  1 
is  LAt,  then 

et-  (k  +  1)  =S£-k-l  t^kat-k  +  ^k-lat-k+l+  ’+'J,oat] 

+  ^-k-2r't  +  l  +  ?£-k-3nt  +  2+’"  +  ?oT^t-(k  +  l)+£  *  (4'36) 

Equation  (4.36)  gives  the  £tk  step  ahead  forecast  error 

for  the  univariate  threshold  model  when  k  forecasts  fall 

in  Rj ,  £>k+l.  a^vN (0 ,aa2)  and  n (o , on 2 ) .  So,  when  n 

is  sufficiently  large,  et-(k+l)^'^  :'-s  Normally 
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distributed.  The  mean  of  the  distribution  is 

E[et-(k  +  l)  (£'k) 5  =  5£-k-l C^kE  ^at-k] +^k-lE [at  +  k  +  l]  +  ”  ’ 

+  ^qE  [ 3-^ ]  ^  ^  ^£-k-2E^r't  +  l^  +  *  *  *+?0E^rit-(k+l)  +  £  ^ 
=  0  . 

Within  a  grid,  the  noise  series  does  not  co-vary  with  the 
lagged  values  of  the  same  series.  Also,  it  is  assumed 
that  different  noise  series  are  uncorrelated  at  different 
time  indexes.  Therefore,  since  each  of  the  noise  terms  in 
(4.36)  has  a  different  time  index,  there  is  no  covariance 
between  terms.  Consequently, 


E[et-  (k+l)  ^<*'k>  1  =  ?Vk-l  <*k^*k-l2+‘  "*o2>  °a" 

+  <5£-k-22  +  ^-k-32+’**  +  5oVn2 


(4.37) 


is  the  variance  of  the  error  distribution. 

Equation  (4.36)  is  easily  generalized  for 
multiple  threshold  crossings.  At  each  threshold,  all 
previous  psi  weights  are  multiplied  by  the  psi  weight  of 
the  new  model.  The  error  variance  in  (4.37)  is  merely 
increased  to  include  the  psi  weights  of  the  new  grid. 


Threshold  Interval  Forecast  Example  (Latitude) 


The  following  position  series  is  from  hurricane 
Frederic  (1979):  ((12.0,45.1)  (12.5,47.0)  (12.9,48.7) 

(13.3,50.4)  (13.8,52.3)  (14.3,54.1)  (14.9,55.5) 

(15.5,57.2)  (16.3,58.8)  (16.7,59.8)  (17.1,60.8) 


'V  '\  ^  \ 
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(17.5.61.8)  (17.8,62.8)  (18.0,63.8)  (18.1,64.8) 

(18.1.65.8)  (18.1,66.8)3 

Suppose  13.8  is  the  last  observed  latitude.  Then  fore¬ 
casting  with  a  lead  time  of  72  hours  via  the  appropriate 
model  yields  the  following  forecasts: 

((14.2,54.1)  (14.6,55.9)  (15.1,57.6)  (15.4,59.1) 

(15.8,60.6)  (16.2,62.1)  (16.6,63.5)  (17.1,64.8) 

(17.5,66.1)  (18.0,67.4)  (18.4,68.6)  (18.8,69.8)3  . 

For  latitude,  the  48  hour  forecast  of  17.1  has  estimated 
variance 

V[e5(8,2)]  *  C52[^22  +  ^l2+,<'o2]cra2+(542+'*'+?o2)an2  * 

For  the  grids  10-15N  and  15-20N  the  variance  estimates  are 

Sa2  =  ( 8 . 7) 2  =  75.7  ,  5n2  =  (11.2) 2  =  125.4  . 

From  the  univariate  forecasting  confidence  interval 
example  $0=1,  $1=1.696,  $ 2=2.179  .  For  the  new  model 

LAt=l. 766LAt_1-. 776LAt_2-.106LAt_4+. 106LAt_5 
-. 010LOt_1+ .010LOt_2+ .073LOt_4-.073LOt_5 . 

For  this  example  ignore  the  longitude  coefficients. 

This  yields 


$o  =  l,  $!  =  1 • 766  ,  1 2  =  1 • 766 ( )  +  (- • 766 )  (?0)=2.353. 


Similarly, 


£3=2.813,  £4=3.059,  and  £5=3.166  . 

Then  the  estimated  eight  step  ahead  error  variance  and 
standard  deviation  are 

V[e5 (8, 2) ]=10.023(8. 624) 75. 7+ (26. 926) 125. 4=99 19. 904 
SD[e5  (8,2) ] =99.6  n  mi  . 

Threshold  Interval  Forecast  of  Two  Variables 


Definitions 


t-(k+l)  is  the  time  index  of  the  last  observation 

i=0,l,...,k  is  the  ith  set  of  psi 
weights  for  the  old  region  Rj 

Subscript  (11, i)  is  latitude  predicting  latitude  at  lag 
Subscript  (12, i)  is  longitude  predicting  latitude 
Subscript  (21, i)  is  latitude  predicting  longitude 
Subscript  (22, i)  is  longitude  predicting  longitude 

j  is  the  index  of  the  region 


i  =  0 , 1 , . . . ,  A-k-1  is  the  i^  set  of  psi 
weights  for  the  new  region  Rj+1 

k  is  the  number  of  forecasts  in  Rj 

l  is  the  number  of  steps  ahead  from  the  last  observation 

el,t-(k  +  l)(  *>k)  is  the  step  ahead  forecast  error  for 

latitude  from  time  origin  t-(k+l) 

e2,t-(k+l)(  *>k)  is  the  fc th  step  ahead  forecast  error  for 

longitude  from  time  origin  t-(k+l) 

alt  t  =  0,l,...,k  is  the  latitude  noise  series  from  Rj 


*11, i  ?12,i 


*11, i  *12, i 

t  o  1  -I  t  O  O  i 


a2t  t  =  0,l,...,k 

is  the 

longitude 

noise  series  from 

ait  '  N(0-«ai2' 

i  =  l,2 

nlt  t  =  k  +  l,...,n 

is  the 

latitude 

noise  series  from  Rj+^ 

n2t  t  =  k+l,...,n 

is  the 

longitude 

noise  series  from  Rj+^ 

"it  %  N(0,ani2) 

i  =  l » 2 

cov(ait'njs)  = 

0 

t 

*  s 

P°ai0 

n  j  1 

=  s 

LA 


t+3 


I _ LOt+3- 


is  observed  latitude  and  longitude  at  time  t+3 

nth 


LAt_1(£)  J  is  the  £  step  ahead  forecast  from  time  t-1 
L0t_1(£) j 
~LA  ' 


LO, 


is  the  first  forecast  to  fall  in  Rj+1 


— 1 

”*11 

*12 

— *21 

*22 — 

~*11 

*  1 2 

-*21 

*21- 

are  the  lag  1  AR  coefficients  for  R. 


are  the  lag  1  AR  coefficients  for  Rj+^ 
Consider  the  AR(1)  bivariate  process  where 


LAt  -  +  4li2^J®t-l  +  alt 

L°t  =  *21LAt-l  +  ^22LOt-l  +  a2t 


This  may  be  written  in  matrix  form  such  that 


▼ 
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LAt 

*11 

*12 

LAt-l 

+ 

alt 

_LOt_ 

— ( *21 

*  2  2 — 

_LOt-l- 

— a2t — 

' 


Referring  to  figure  4.2,  consider  the  bivariate  forecast 
of  (LAt  +  ^,L0t  +  2)  •  The  forecast  is  made  using  a  "new"  set 
of  parameters  from  Rj+^.  From  time  origin  t,  for  the  true 


model 


LAt+l 

*11 

$12 

_LOt+l— 

_*21 

*22- 

LAt 

nlt+l 

+ 

_L0t_ 

— n2t+l_ 

The  one  seep  ahead  forecast  is 


‘v 

LAt(l) 

II 

*11  *12 

LAt 

1 

__L0t  ( 1 )  _ 

1  1 

— *21  *22— 

_LOt- 

For  a  time  origin  of  t-1,  for  the  true  model 


LAt 

_ 

*11 

*12 

LAt-l 

+ 

1 

M 

<1* 

_ 1 

-LOt- 

— *21 

*22- 

-LOt-l- 

— a2t— 

the  one  step  ahead  forecast  is 


LAt_x (1) 

*11 

*12 

i 

LAfc_- 

-LOt-l(l>_ 

_*21 

*22 _ 

_L°t-l__ 

and  the  forecast  error  is 


» 

>* 

eit-i 

LAt"  j_ 

LAt-id) 

alt 

i 

_ e2t-l  ( 1  f  0)_ 

LLOtJ 

[__LOt-l  (1) _ 

_ a2t — 

For  two  steps  ahead, 


V 

V 

LAt+l 

$11  $12 

LAt 

+ 

nlt+l 

i 

— LOt+l_ 

_ ®  2 1  $2  2_ 

_ L°t _ 

_ n2t  +  l _ 

AD-A17D  742 
UNCLASSIFIED 


TINE  SERIES  PREDICTION  OF  HURRICANE  LANDFALL(U)  AIR 
FORCE  INST  OF  TECH  NR I QHT-PATTERSON  AF8  OH  T  F  CURRV 
HAV  SS  AFIT/CI/NR-86-75D 

F/B  4/2 


2/2 
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$H  $i2 
_*21  *22- 


'11  *12 


’21  9  2  2 


LAf.i 


Mt  +  l 

-n2t  +  l- 


Then  the  forecast  error  is 


elt-l(2,0)~l  pH  41 1 2  I  alt  I  n  1 1+ 1 

=  + 

— e2t-l  (2'0)_J  L$21  *22—1  a2t Ln2t+1— 

For  time  origin  t-2,  for  one  step  ahead 


LAt_i 


*11  *12 


LAt-2 


alt-l 


LL0t-l_J  1 _ *21  ^  2  2 _ I  LL0t-2_J  La2t-1 


and  the  forecast  error  is 


elt-2  drl) 

e2t-2 


alt-l 

a2t-l — 


For  two  steps  ahead  (time  origin  still  t-2) 


’ll  *12 


21  922- 


LAt-l 


*i2~i  r *11  *12  r LAt_2n  rait-in  r~ait 

+  + 

— *21  *22—  — *21  *22 — i  — LOt-2 — '  La2t-2— <  <—a2t— 


* — *21  *22 — I  1 — *21  *22—1 

and  the  forecast  error  is 


elt-2 (2,1 ) 
e2t-2  d,l) 


*11  *12 
*21  *2  2. 


alt-l 

a2t-l 


For  three  steps  ahead,  the  new  model  should  be  used.  Then 


LAt  +  l 
-LOt+l- 


$H  $12 


nlt  +  l 


1 — *21  *22— J  I — LOt — 1  l — n  2 1+ 1 — I 

*11  *12  *11  *12  *11  *12  LAt-2 

— *21  *2  2 —  *21  *22—1  — *  21  *22—1  I— LOt-2— 
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$11  *12 

*11  *12 

alt-l 

*21  *22— 

— *21  *22— 

— a2t-l— 

*11  *12 

alt 

*21  *22— 

— a2t— 

n 


lt  +  1 


1— n2t  +  l—l 

These  results  are  the  multivariate  analogy 
identical  to  the  univariate  case,  except  they  are  in 
matrix  form.  Thus,  the  conclusions  are  similar.  If  the 
forecast  origin  is  t-(k+l)  and  the  first  forecast  in  Rj+1 
is  (LAt,LOt) ,  then  the  £th  step  ahead  forecast  error  is 
given  by 


elt-  (k+1)  U,k) 

^11 ,  £-k-l  512,£-k-l 

— e2t-  (k  +  1)  U,k)  — 

— C21,£-k-l  ^ 12 , £-k-l — 

*11, k  *12, k 


l— *2I,k  *22  ,k _ I 


lt-k 
-a2t-k — I 


+  •  •  •  + 


*ll,o  *12,o 


“It 

L*21,o  *22, ° — I  I a 2 1 I 


5 11 , £-k-2  ?12,£-k-2 

nlt  +  l 

—^21 , £-k-2  ?22,£-k-2_ 

_n2t+l_ 

cll,o  512,o  | 

nlt-  (k  +  1) +£ 

1 

— ^ 21,o  ^22,  o — ! 

Lr'2t-  (k  +  1)  +£_ 

(4.38) 


The  psi  weights  for  (4.38)  are  the  values  calculated  from 
(4.29)  and  (4.30).  As  in  the  univariate  case,  the 


individual  forecast  errors  are  a  linear  combination  of 


independent  Normally  distributed  noise  terms,  so  they  are 
Normally  distributed,  and  the  distribution  of  the  bivar¬ 
iate  threshold  forecast  is  bivariate  Normal.  The  covar¬ 
iance  matrix  for  the  £th  step  ahead  forecast  can  now  be 
calculated.  From  equation  (4.38), 

elt-  (k+1)  (£'k)  =  (?ll,£-k-l’*’ll,k+Sl2,£-k-1^21  ,k)alt-k 

+  (Cll,il-k-1^12,k  +  512,£-k-l'l'22,k)a2t-k 

+  •  •  • 

+ (?ll,A-k-l*ll,o+*l2,*-k-l*21,o)alt 
+  (5ll,£-k-l1<'l2,o  +  ?12,£-k-l’,'22,o,a2t 
+  ^1 1 ,  £-k-2rj  1 1  + 1 +  ^  1 2,£-k-2r,2t+l^  +  *  "  ’ 

+ (5ll,onlt- (k+1) +£+?12,on2t- (k+1) +£) 

For  notational  convenience,  this  can  be  rewritten  as 

elt- (k  +  1)  (£'k) =clt-kalt-k+c2t-ka2t-k+  * ’ ’ +C1 tal t  +  c2ta2 1 

+clt  +  lnlt+l  +  c2t  +  ln2t+l  +  * ' *+clt- (k+1) +£nlt- (k  +  1) +£ 

+c2t- (k+1) +£n2t- (k+1) +£ 

Then , 

E[elt-(k  +  l)2(£'k)]  =  (clt-k2  +  "*+clt2)0al2 
+(c2t-k2+*',+c2t2)°a22 
+  2(clt-kc2t-k+*”+cltc2t)cov(alta2t) 

+  (Cit+i2+--*+Cit_(k  +  i)+Jl2)oni2 
+  (c2t  +  l2  +  '**+c2t-(k  +  l)  +  £2)  an2  2 

+  2 (clt  +  lc2t  +  l+’ ' ’ +clt- (k  +  1) +£c2t-  (k+1) +£)cov (nltn2t)  -  (4.39) 


Similarly, 


e2t-  (k  +  1 )  "  ^21,£-k-l^ll,k  +  ^22,£-k-l5  21  ,k)alt-k 

+ (C21,£-k-l512,k+C22,£-k-l522,k)a2t-k 

+  •  *  * 

+ (C21,£-k-l5llfo+522,£-k-lC21,o,alt 
+ (521 ,£-k-l?12,o+?22,£-k-l?22,o) a2t 
+  <*21,£-k-l5lt  +  l  +  $22f£-k-2?2t  +  l)  +  “  ' 

+  (?21,onlt-(k  +  l) +£+c22,on2t-(k+l) +  £>  ' 

and  this  may  be  rewritten  as 

E[e2t-(k+l)2<^k)J  =  (dlt_k2+*-*+dlt2)aal2 
+<d2t-k2+***+d2t2)  a22  0 

+  2(dlt-kd2t-k+’"+dltd2t)covfalta2t] 

+  (dlt+l2+*  ”+dlt-(k+l)  +  £2)  °nl2 
+ (d2t+l2+* ’ ’+d2t-(k+l) +£2) °n22 
+2 (dlt+ld2t+l+ 

+dlt-  (k  +  1)  +  £d2t-  (k  +  1 )  +  £ )  cov  ltr>2t  ^  '' 

so , 

E  [ e  1 1-  ( k+ 1 )  (  ^  / k)  e2t-  (k+1)  Urk)  ] 

=  (cit-kdlt-k+*  "+catdlt)°al2 

+  C (clt-kd2t-k+*  *  *  +C1 td2t ) 

+ (c2t-kdlt-k+ ’ * ,+C2tdlt) 3cov[alta2tJ 
+ (c2t-kd2t-k+* ' ’+c2td2t)°a22 
+  (cit+ldit  +  l+  ’  *  *+cit-(k+l)  +  *dlt-(k  +  l)  *1)  °ril2 


(4.40) 


+  [ (clt  +  id2t  +  l+*  *  *+clt-(k+l) +£d2t-  (k+1) +*' 

+  (c2t+ldlt+l+  *  "  +c2t-  (k+1)  +£dlt-  (k+1)  +«->  }cov[n1n2l 

+  (c2t+ld2t  +  l+* ’ *  +c2t- (k+1) +£d2t- (k  +  l)+£)on22  *  (4‘41) 

Although  these  calculations  are  tedious,  it 

may  be  noted  that  equations  (4.39)  through  (4.41)  are 

the  Kronecker  product  of  the  error  terms  defined  by  (4.38) 

where  the  Kronecker  product  is 


el 

H  Ol  e2lH  = 

el2  ele2 

0 

— e  2 _ 

— ele2  e2  J 

(4.42) 


The  expected  value  of  (4.42)  is  the  covariance  matrix  of 
the  forecast  of  interest. 

The  case  of  multiple  threshold  crossings  is  easi¬ 
ly  handled.  The  first  psi  weight  matrix  of  the  new  grid 
is  used  to  multiply  all  previous  psi  weight  matrices. 

Thus,  the  constants  in  the  2  by  2  matrices  of  equation 
(4.38)  change,  and  an  additional  sum  is  needed  to  account 
for  the  new  grid.  For  example,  in  the  case  of  two  thres¬ 
hold  crossings  (the  most  that  would  normally  be  expected 
in  72  hours)  where  there  are  k^,  k2r  and  k^  forecasts  in 
the  first,  second,  and  third  grid  respectively 
(ki  +  k2+k3=  i)  »  4*  S'  and  B  dre  2  by  2  psi  weight  matrices, 
and  a,  n ,  and  b  represent  the  applicable  noise  terms,  a 
notationally  simplified  version  of  (4.38)  is  given  by 


[_e2 _ I  L  3  J  L  ~2J  l  L  U  La2 _ I  L  J  La2 — I 


Equations  (4.38)  -  (4.41)  can  be  used  to 
determine  the  confidence  ellipse  around  a  forecast  that 
crosses  a  grid  boundary.  The  equations  are  the  theoreti 
cal  contribution  of  this  research. 

Bivariate  Threshold  Interval  Forecast  Example 

The  same  hurricane  track  is  used  in  this  exampl 
(hurricane  Frtderic  1979).  It  should  be  noted  that  the 
correlation  and  standard  deviations  are  based  on  all 
storms  rather  than  only  category  five  storms.  This 
increases  the  standard  deviations  used  in  the  confidence 
ellipse  by  approximately  40  percent  due  to  the  inclusion 
of  accelerating  storms.  The  increase  affords  additional 
protection  when  forecasting  storms  that  are  not  category 
f  ive . 

Details  of  the  following  calculations  used  to 


compute  the  90%  confidence  ellipse  are  in  Appendix  C. 
Psi  weight  matrices  and  covariance  matrices  for  the  one 


step  ahead  forecast  for  each  grid  are  in  Appendix  B.  For 
this  example,  Frederic  is  forecast  from  origin  (13.8,52.3) 
which  is  the  fifth  report  of  the  series  (time  origin  5). 
Two  forecasts  fall  in  the  old  region  from  10-15N  (k=2). 

The  following  procedure  is  used  to  defir  a  the  90%  confi¬ 
dence  ellipse  centered  at  the  eight  step  ahead  forecast: 

(1)  Compute  the  point  forecast  using  the  models  in 
Appendix  A.  Here  the  point  forecast  is  (17.1,64.8). 
Note  the  number  of  forecasts  that  fall  in  the  old 
region  (k=2). 

(2)  Write  the  models  in  <p  (B)  polynomial  form  (see 
Appendix  C) . 

(3)  Compute  the  psi  weight  matrices  for  each  region  the 
storm  crosses  up  through  the  eleventh  term.  Eleven 
matrices  (12  including  ^0)  are  needed  to  compute  the 
variance  of  the  12  step  ahead  (72  hour)  forecast. 

(4)  Let  Ea  be  the  covariance  matrix  of  the  one  step  ahead 
forecast  errors  for  the  old  region.  En  is  the 
covariance  matrix  of  the  one  step  ahead  forecast 
errors  for  the  new  region.  Form  the  Kronecker 
product  of  (4.38)  as  shown  in  (4.39)  through  (4.41) 
and  Appendix  C.  This  yields  Lgr  the  estimated 
covariance  matrix  of  the  eight  step  ahead  forecast. 

(5)  Compute  the  eigenvalues  Alf  and  A2  (A2  is  the 
largest)  and  eigenvectors  for  Eg.  Compute  the 


estimated  coeff icient  of  correlation  (p )  for  L  g* 

Then  the  approximate  99%  confidence  ellipse  for  this 
example  is 

(x1/168.6)2-2(.076) (xj/168.6) (x2 / 1 1 0 . 9 ) + ( x2 / 1 1 0 . 9 ) 2 
=  1.645(1— (-076) 2) 

where  the  minor  axis  is  oriented  on  a  magnetic 
heading  of  4.99  degrees  and  is  centered  at  the  point 


forecast . 


CHAPTER  5 


FORECASTING  RESULTS 

The  results  of  applying  the  models  in  Appendix  A 
to  historical  hurricane  tracks  are  presented  in  this 
chapter.  The  analysis  that  led  to  the  use  of  velocity¬ 
stationary  (category  5)  models  is  discussed.  Next,  the 
actual  empirical  models  are  presented  and  compared  with 
NHC  official  forecasts,  and  forecasts  for  storms  from  1945 
through  1983.  Then  the  results  for  1985  storms  are 
discussed . 

Predicting  the  Stationarity  Category 

Transition  matrices  were  investigated  as  a  means 
of  forecasting  the  correct  future  stationarity  category. 

A  window  of  seven  previous  six  hour  observations  from  each 
track  was  used  to  develop  the  matrices.  Segments  of  longer 
length  missed  short  term  accelerations,  and  segments  of 
shorter  length  resulted  in  large  variances  in  the  parame¬ 
ter  estimates.  At  each  new  position,  the  stationarity 
category  was  determined.  For  example,  at  the  seventh 
position  report,  the  stationarity  category  was  based  on 
positions  one  through  seven.  At  the  eighth,  the  station- 


arity  category  was  based  on  positions  two  through  eight, 
etc..  In  this  manner  a  storm  category  (direction  of 
movement)  was  associated  with  the  terminal  point  in  each 
"window".  Numerical  results  of  this  analysis  are 
presented  in  Appendix  E. 

A  transition  matrix  was  then  constructed  based  on 
a  previous  category.  For  example,  regardless  of  which 
category  the  storm  was  in  at  position  seven,  it  was  most 
frequently  in  category  five  at  later  positions.  This 
implied  that  any  acceleration  of  the  storm  was  short 
lived.  Consequently,  for  six  hour  data  a  good  guess  of 
the  future  category  of  any  hurricane  would  be  latitude- 
velocity,  longitude-velocity  stationary. 

The  Forecast  Models 

Forecasts  were  made  using  the  category  five  model 
in  each  latitude  band.  The  model  is  bivariate  AR(4)  in 
velocity  where  latitude  and  longitude  are  functions  of 
each  other  at  lags  one  and  four.  The  model  uses  five 
previous  positions  (differenced  once  to  obtain  four 
velocities),  and  thus  adapts  itself  to  the  changing  motion 
of  the  storm. 

There  are  definite  relationships  between  the 
model  structure  and  known  meteorological  phenomena  asso¬ 
ciated  with  hurricanes.  Specifically,  segmenting  by 
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latitude  band  relates  to  large  scale  climatology  differ¬ 
ences  observed  as  the  distance  from  the  equator  increases. 
The  lag  one  parameter  captures  persistence,  and  the  lag 
four  parameter  models  the  diurnal  effect  of  the  sun  (the 
slowing  of  the  storm  at  night). 

Probably  the  most  significant  result  is  that  the 
forecast  parameters  are  based  only  on  velocity-stationary 
hurricanes.  It  is  not  valid  to  use  accelerating  storms 
in  a  velocity  model.  Admitting  such  storms  to  the  data 
base,  when  forecasting  the  next  change  in  position,  biases 
the  parameter  estimates  and  increases  the  variance  of  the 
forecast  error. 

The  forecast  models  for  the  five  degree  latitude 
bands  from  10  degrees  north  latitude  to  45  degrees  north 
latitude  are  listed  in  Appendix  A.  Three  digits  after  the 
decimal  are  reported  (even  when  insignificant)  so  that  the 
form  of  the  models  remains  the  same  in  each  band.  A 
parameter  standard  deviation  is  reported  for  the  lag  one 
parameter  value  even  though  it  was  derived  directly  from 
the  lag  two  value  (by  adding  the  constant  one).  The  lag 
one  parameter  value  increases  as  the  storm  moves  north  and 
remains  fairly  constant  above  25  degrees  north  latitude. 
Also,  latitude  seems  to  be  a  better  predictor  of  longitude 
as  opposed  to  predicting  via  the  reverse  relationship.  In 
general  the  lag  four  relationships  are  weak,  but  they  are 


significant  for  some  regions. 

To  use  the  models,  the  most  recent  position 
report  becomes  (LAt_lf  and  the  forecast  (LAt,  L0t) 

is  calculated  for  a  six  hour  leadtime.  Twelve  to  72  hour 
forecasts  are  then  obtained  by  treating  (LAt,  L0t)  as  the 
last  true  position  and  then  forecasting  another  6  hour 
increment.  When  LAt  moves  into  a  new  latitude  band  the 
model  associated  with  that  latitude  band  should  be  used. 


Great  Circle  Distance 


A  popular  measure  of  forecast  accuracy,  and  the 
measure  used  in  this  dissertation,  is  the  great  circle 
distance  between  the  actual  and  the  forecast  position  of 
the  eye  of  the  hurricane  (Pike,  1985).  Let  (LAF,LOF)  be 
the  latitude  and  longitude  of  the  forecast  position,  and 
let  (LA,LO)  be  the  actual  position.  Then  the  great  circle 
distance  in  nautical  miles  between  the  two  points  is 
60  cos“^ [sin (LAF) sin (LA) +  cos (LAF)cos (LA) cos (LO-LOF) ] 


Forecast  Results 


Overall  forecast  errors  for  all  storms  from  1945 
through  1983  in  each  latitude  band  were  computed  using 
great  circle  distance  (Table  5.1).  Sample  standard  devia¬ 
tions  (n  mi)  and  sizes  are  reported  below  each  mean  error. 
Similarly  to  Neumann  and  Pelissier  (1981),  the  average 


error  was  computed  for  the  southern  (10-25N)  and  the 
northern  (25-45N)  region  and  then  summarized  (Table  5.2). 
Results  are  compared  with  the  NHC  average  errors.  The 


TABLE  5.1 


TIME  SERIES  BEST  TRACK  MEAN  FORECAST  ERRORS  (N  MI (KM)) 


Forecast  interval  (hours ) 


Region 

24 

48 

72 

10-15N 

57.9(107.3) 

131.9  (244.4) 

204.8(379.5) 

45-83W 

(35.1,190) 

(75.7,103) 

(118.1,50) 

1 5- 2  ON 

|  70.6(130.8) 

154.6(286.5) 

253.0(468.8) 

45-87W 

(47.6,721) 

(109.4,403) 

(160.0,221) 

20-25N 

]  76.6(142.0) 

165.6(306.9) 

260.1(482.0) 

45-100W 

(49.9,956) 

(96.4,519) 

(135.2,279) 

2  5- 3  ON 

|  96.2(178.3) 

203.2(376.6) 

293.9  (544.7) 

45-100W 

(66.6,889) 

(131.6,448) 

(163.7,223) 

30-35N 

|  107.7(199.6) 

242.4(449.2) 

358.6(664.5) 

45-80W 

(67.2,598) 

(138.0,365) 

(188.9,230) 

35-40N 

|  121.0(224.2) 

267.5  (495.7) 

391.8(726.0) 

45-76W 

(74.3,344) 

(132.8,176) 

(199.5,79) 

40-45N 

|  138.7(257.0) 

269.8(500.0) 

206.3(382.3) 

45-70W 

(79.7,106) 

(178.1,47) 

(203.9,21) 

time  series  storm  tracks  are  not  the  same  as  those  used  by 
Neumann  and  Pelissier  because  the  time  series  sample  is 
larger.  However,  time  series  analysis  of  1973-1979  storms 
in  the  southern  region  (a  subset  of  the  1945-1983  data 
base)  produced  deviations  of  less  than  10  n  mi  from  the 
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reported  average  errors  for  the  entire  sample  (Table  5.2). 
Thus,  comparisons  should  be  fairly  accurate.  Assuming 
equivalent  NHC  standard  deviations  and  sample  sizes,  the 
time  series  (TS)  errors  (Table  5.2)  are  significantly  less 
(95%  confidence)  than  the  NHC  errors  at  48  and  72  hours. 

It  should  be  noted  that  TS  results  are  based  on 
best  track  data  instead  of  operational  position  reports. 
Best  tracks  are  constructed  in  a  careful  post-storm 
analysis  that  combines  position  data  from  all  available 
sources.  Some  subjective  smoothing  is  then  employed  to 
plot  the  best  track.  Neumann  observed  that  the  average 
error  for  real  time  position  reports  is  approximately 

TABLE  5.2 

COMPARISON  OF  MEAN  FORECAST  ERRORS  (N  MI (KM) ) 


Forecast  interval  (hours) 


Grou 


Source 


1. 

Entire  sample 

NHC 

110.1(204) 

24  4.0(452) 

362.0(671) 

TS 

109.8(203) 

214.6(398) 

312.0(578) 

2. 

Northern  region 

NHC 

131.3(243) 

304.4(564) 

421.0(780) 

TS 

126.5(234) 

251.0(465) 

351.5(651) 

3. 

Southern  region 

NHC 

84.5(157) 

179.2(332) 

317.3(588) 

TS 

92.4(171) 

177.9(330) 

272.2(504) 

20 

n  mi  (Barney,  1983). 

In  order  to 

approximate  true 

error,  the  time  series 

errors  listed 

in  Table  5 

.2  include 

an 

additional  20  n 

mi  error  over  the 

actual  value  obtained 

using  the  model. 

To  better  compare  the  procedures,  the  five  storms 
in  Table  2.1  were  forecast  using  operat iona 1  position 
reports.  The  average  forecast  errors  were  94.8(176), 
163.1(302),  and  227.4(421)  n  mi  (km)  as  compared  to 
77.3  (143),  179.4  (332),  and  317.3(588)  for  the  NHC.  The 
differences  are  significant  at  the  95%  confidence  level  at 
24  and  72  hours.  This  analysis  also  confimed  that  20  n  mi 
is  an  appropriate  amount  to  add  to  the  best  track 
forecasts  to  account  for  operational  position  error. 

In  1985  there  were  eleven  hurricanes  and  tropical 
storms  in  the  North  Atlantic  region.  The  National  Weather 
Service  office  in  Austin  issued  position  reports  for  eight 
of  them.  Two  of  the  eight  storms  were  at  sea  for  less 
than  five  position  reports  and  could  not  be  forecast  by 
the  time  series  model.  The  remaining  6  storms  yielded  55 
24  hour  forecasts,  31  48  hour  forecasts,  and  10  72  hour 
forecasts.  The  average  forecast  errors  for  the  time  series 
model  were  118,  273,  and  241  n  mi  respectively,  indicating 
that  the  storms  were  not  category  five.  Indeed,  looping 
and  accelerating  storms  caused  the  NHC  to  alert  wide  areas 
of  the  Gulf  coast  where  four  of  the  storms  came  ashore. 
Forecast  accuracies  for  the  NHC  will  not  be  available 
until  late  in  1986. 

It  is  evident  that  the  time  series  approach  is 
valuable  in  reducing  forecast  error  especially  for  the 


long  range  forecast.  The  fact  that  it  rivals  the  official 
forecast  accuracy  of  the  National  Hurricane  Center  is 
significant.  This  statement  is  made  in  consideration  of 
the  fact  that  the  time  series  model  has  used  only  latitude 
and  longitude  as  predictor  variables,  while  the  NHC  has 
available  a  plethora  of  predictive  tools.  Incorporation 
of  exogenous  variables  into  the  time  series  model,  and 
various  extensions  discussed  in  Chapter  6,  can  only  be 
expected  to  improve  the  forecast  accuracies. 


CHAPTER  6 


SUMMARY  AND  CONCLUSIONS 

Both  the  theoretical  and  application  oriented 
results  of  Chapters  4  and  5  are  summarized  in  this 
chapter,  as  they  relate  to  hurricane  forecasting  and 
threshold  autoregressive  time  series  models.  Topics  for 
future  research  are  also  discussed. 

Hurricane  Modeling  Results 

The  original  objectives  of  this  research  were: 

(1)  to  predict  hurricane  movement  based  on  time  series  of 
observations  of  latitude  and  londitude  taken  at  six  hour 
intervals,  (2)  to  determine  the  confidence  intervals 
surrounding  the  forecasts,  and  (3)  to  use  a  model  whose 
coefficients  could  change  as  the  storm  moved.  The  models 
were  to  be  categorized  by  time  of  year  and  region  of  the 
ocean  in  which  the  storm  was  located. 

The  first  problem  encountered  in  the  research 
involved  construction  of  the  data  files.  Segmenting  the 
hurricane  tracks  by  region  resulted  in  several  short 
bivariate  time  series,  each  containing  approximately  10-14 
observations.  These  tracks  could  not  be  directly 
concatenated,  because  the  last  observation  of  one  storm 
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would  have  been  used  to  predict  the  first  observation  of 
the  next  storm  which  was  intuitively  incorrect.  The 
lagged  data  needed  to  be  "storm  unique."  The  commercially 
available  time  series  computer  programs  were  unable  to  lag 
the  data  from  the  segmented  tracks  in  the  required  manner. 
A  FORTRAN  program  was  written  to  lag  the  data  six  time 
periods,  and  SPSS  was  used  to  compute  regression 
coefficients . 

The  initial  forecasting  model  (in  which  the 
position  forecasts  were  based  on  past  positions)  was 
nonstationary.  The  latitude  and  longitude  series  were 
differenced  and  the  coefficients  recomputed.  Two  findings 
were  immediately  apparent.  The  level  of  differencing  of 
the  latitude  and/or  longitude  series  required  to  satisfy 
stationarity  resulted  in  nine  possible  stationarity  cate¬ 
gories  for  the  storm.  These  categories  were  related  to 
direction  of  movement  (Table  4.1).  In  addition,  parame¬ 
ters  significantly  different  from  zero  occurred  at  lags 
one  and  four.  It  was  believed  that  these  parameters 
related  to  known  meteorological  phenomena  associated  with 
hurricanes.  Specifically,  the  lag  one  parameter  captured 
persistence,  and  the  lag  four  parameter  modeled  the  diur¬ 
nal  effect  of  the  sun  (the  slowing  of  the  storm  at  night). 

For  storms  that  were  velocity-stationary  (the 


latitude  and  longitude  series  were  differenced  once)  the 
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last  change  in  storm  position  was  the  best  predictor  of 
the  next  change  in  storm  position  regardless  of  location 
or  time  of  year.  Segmenting  data  by  month  increased 
forecast  accuracy  for  some  months,  but  resulted  in  an 
overall  decrease  in  average  forecast  accuracy.  Time  of 
year  was  discarded  as  a  predictor  variable.  By  trial  and 
error  it  was  determined  that  segmenting  the  North  Atlantic 
by  latitude  bands  eight  degrees  in  width  with  a  three 
degree  overlap  resulted  in  the  smallest  forecast  errors. 
This  led  to  a  "threshold"  approach  in  which  parameter 
values  were  allowed  to  change  when  the  storm  crossed  into 
a  new  region.  There  were  nine  models  for  each  region 
depending  on  the  stationarity  category  of  the  storm. 
Forecast  errors  based  on  the  velocity-stationary  models 
were  small. 

In  order  to  increase  forecast  accuracy,  empirical 
transition  matrices  were  computed  in  a  Markov  chain 
approach  to  modeling  the  transition  of  storms  between 
categories  (Appendix  E).  It  was  determined  that  future 
storm  categories  were  usually  category  five  (velocity¬ 
stationary)  regardless  of  the  current  category.  This 
implied  that  storm  accelerations  were  short  lived.  Thus, 
a  velocity  model,  especially  one  based  on  the  last  change 
in  position,  was  expected  to  accurately  forecast  all  cate¬ 
gories  of  storms,  at  least  for  6-hour  position  report  data 
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For  the  time  series  model  the  average  error  for 
the  24  hour  forecast  is  110  n  mi.  The  official  forecasts 


of  the  NHC  have  the  same  average  error.  At  a  72  hour  lead 
time,  the  average  error  of  the  time  series  model  is  312 
n  mi  as  compared  to  362  n  mi  for  the  NHC.  The  accurate 
long  range  forecast  is  probably  due  to  two  important 
characteristics  of  the  time  series  model.  For  a  station¬ 
ary  model,  the  "importance"  of  the  last  observed  velocity 
decays  exponentially  as  the  forecast  lead  time  increases. 
Thus,  if  the  last  velocity  observation  is  inaccurate,  its 
effect  diminishes  rapidly.  This  rapid  decay  causes  the 
long  range  forecasts  to  be  based  primarily  on  the  mean  of 
the  latitude  and  longitude  series,  which  is  the  average 
motion  vector  of  past  storms  in  the  region.  This  suggests 
that,  on  the  average,  in  particular  regions  of  the  ocean, 
hurricanes  move  in  the  same  direction. 

Future  Hurricane  Modeling 

In  this  study,  the  velocity-stationary  model  is 
used  to  predict  movement  of  all  hurricanes.  This  is  a 
very  "broad  brush"  approach.  Although  category  five 
occurs  most  frequently,  there  are  eight  other  categories 
into  which  a  storm  can  be  classified.  The  typical  storm 
spends  only  30-40  percent  of  its  life  as  a  category  five 
storm.  The  rest  of  the  time  the  storm  is  accelerating. 


The  key  to  more  accurate  movement  prediction  is  the 
ability  to  forecast  these  accelerations.  This  is  an 
extremely  difficult  problem  whose  solution  has  always 
eluded  hurricane  scientists.  Predictors  such  as 
surrounding  pressure  fields,  maximum  wind  velocity,  upper 
level  steering  winds,  and  central  pressure  seem  to  be 
useful  for  some  storms,  but  not  for  others.  It  is 
possible  that  certain  of  these  exogenous  variables  are 
good  leading  indicators  only  for  storms  in  a  particular 
category. 

It  is  also  possible  to  make  better  use  of  the 
category  transition  matrices.  There  are  nine  models  for 
each  grid.  Another  forecasting  approach  would  be  to 
examine  the  percentage  of  storms  that  (say,  beginning  in 
category  five)  are  in  each  of  the  categories  48  hours 
later.  48  hour  forecasts  could  then  be  calculated  by 
combining  forecasts  from  each  of  the  nine  models,  weighted 
by  the  applicable  percentage  of  storms  that  historically 
transitioned  to  that  category. 

Satellite  photographs  taken  at  30  minute 
intervals  are  available.  These  photographs  could  be 
processed  to  yield  position  reports.  The  use  of  more 
frequent  observations  might  increase  the  significance  of 
the  parameter  estimates  corresponding  to  the  24  hour  lag 
and/or  introduce  a  moving  average  parameter. 


In  this  study  the  data  were  only  lagged  up  to  six 
periods.  Coefficients  should  be  computed  at  longer  lags 
to  determine  if  there  are  other  significant  predictor 
variables.  Caution  must  be  exercised  when  lagging  the 
data  depending  on  the  size  of  the  grid  containing  the 
storm  track.  In  the  case  of  latitude  bands  five  degrees 
in  width,  storm  tracks  with  many  observations  are  usually 
associated  with  storms  moving  in  a  westerly  direction. 
Thus,  the  variables  beyond  lag  10  (or  so)  are  not  from  a 
representative  sample  of  hurricanes. 

Theoretical  Results 

The  bivariate  threshold  autoregressive  model 
used  in  this  research  represents  a  piecewise  1 inearization 
of  the  hurricane  movement  process.  It  is  slightly 
different  from  the  TARSO  model  developed  by  Tong  and  Lim 
(1980)  because  allows  a  contemporaneous  covariance 
structure  between  latitude  and  longitude. 

In  order  to  compute  forecast  confidence 
intervals,  it  was  necessary  to  determine  the  bivariate 
distribution  of  the  forecast  errors.  A  psi-weight  repre¬ 
sentation  of  the  univariate  and  bivariate  threshold  auto¬ 
regressive  process  was  used  to  show  that  the  distribution 
of  the  forcast  errors  was  bivariate  Normal.  This  allowed 


the  point  forecast  to  be  bounded  with  an  approximate 
confidence  ellipse. 

In  addition,  it  was  discovered  that  when  a  fore¬ 
cast  sequence  contained  a  threshold  crossing,  the  confi¬ 
dence  interval  calculation  required  that  the  psi-weight 
associated  with  the  first  forecast  in  the  new  region  be 
multiplied  by  all  the  previous  psi-weights.  This  resulted 
in  an  increase  in  the  size  of  the  confidence  region  that 
accounted  for  the  change  in  model  parameter  values  as  the 
threshold  was  crossed.  This  adjustment  was  shown  to  be 
necessary  in  the  univariate  and  bivariate  cases. 

Future  Theoretical  Development 

The  useful  theoretical  extensions  are  based  on 
the  possible  improvements  that  might  be  made  to  the 
forecasting  model.  If  it  is  possible  to  forecast  the 
hurricanes  by  using  a  weighted  combination  of  models,  the 
confidence  interval  calculations  will  have  to  be  changed 
accordingly.  If  more  frequent  observations  introduce  a 
moving  average  component,  it  would  be  a  theoretical  extension 
to  derive  the  confidence  interval  for  the  threshold  ARMA 
model.  This  extension  is  probably  most  worthwhile,  and 
might  also  be  easily  accomplished  using  a  psi-weight 
representation  of  the  forecast.  Finally,  a  prediction 
interval  that  accounts  for  the  variation  of  the  model 
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LAt_^  is  the  most  recent  latitude  report.  L0t_,  is  the  most 
recent  longitude  report.  Standard  deviations  or  parameter 
estimates  are  in  parentheses.  When  LAt  enters  a  new 
band,  the  appropriate  model  from  that  band  is  utilized. 

J 10-15N  J  LAt  =  1.696LAt_1-.  696LAt_2  +  .101 (LAt_4-LAt_5) +.048 
| 45-83W |  (.07)  (.07)  (.07)  (.04) 

LOt  =  .233 (LAt_1-LAt_2)-.075(LAt_4-LAt_5) 

(.11)  (.12) 

+ 1 . 607LOx._  I  -  .  607LOi._  2  +.251  (LCL.-LCL,- )  +  .127 
(.07)  (.07)  (.07)  (.07) 


|  15-20N | 
j 4  5- 8  7W | 


LAt  =  1.766LAt_1-.766LAt_2-.106 (LAt_4-LAt_5) 

(.03)  (.03)  (.04) 

-.010  (LOt_1-LO+._2)  +  .014  (LOt_4-LOt_5)  +  .053 
(.02)  "  (.02)  (.02) 
LOt  =  .012 (LAt_1-LAt_2)-.050(LAt_4-LAt_5) 

(.05)  (.06) 

+1.775LOt_1-.775LOt_2+.088 (LOt_4~LOt_5 ) +.139 
(.03)  (.03)  (.03)  (.04) 


|  20-2 5N  |  LAt 
j  45-100W | 


LO 


t 


=  1.849LA 
(.03) 

- .  0  6  4 ( LO 
(.03) 

=  .026 (LA 
(  .04) 
+1. 779LO 
(  .03) 


t_1-.849LAt_2+ .013 (LAt_4-LAt_5) 

(.03)  (.03) 

t-l~LOt-2) +*073 (LOt-4-LOt-5) +-054 
(.03)  (.02) 

t-l"LAt-2) “ ‘ 052 (LAt-4"LAt-5) 

(.04) 

t_x- . 779LOt_2+ .121 (LOt_4-LOt_5) + . 067 
(.03)  (.03)  (.03) 


|  25-30N  j  LA t  =  1 . 777LAt_x-.  777LAt_2+ .016 (LAt_4-LAt_5) 

1 45-100W j  (.04)  (.04)  (.04) 

-.101 (LOt_1-LOt_2) +.083 (LOt_4-LOt_5) + .071 
(.03)  (.03)  (.03) 

LOfc  =  . 103 (LAfc_1-LAt_2) - . 198 (LAt_4-LAt_5) 

(.05)  (.05) 

+  1.8  3  7LO1__1  -.8  3  7LOt_2  +  .03  2  (L0t_4-L0i-_S)  +  .052 
(.04)  (.04)  (.04)  (.03) 


30-35N 

45-80W 


LAf  =  1 . 74  6LAf_  i  - .  74  6LA*.  _  p  -  .  0 1  5  ( LA*.  _  4  -LA  t-5  ' 

(.06)  (.06)  (.06) 

- .049 (LOt_1-LOt_2) - -Oil (LOt_4-LOt_5) + . 072 
(.05)  (.05)  (.04) 

LOt  =  . 050 (LAt_1-LAfc_2) -. 030 (LAt_4-LAt_5) 

(.07)  (.07) 

+  1 . 841LOt_]_  -  .  84lLOt_2+  .  06  8  ( LOt_  4"LOt_  5  )  -  .04  4 
(.06)  (.06)  (.06)  (.05 

LA t  =  1.735LAt_1-.735LAt_2-.094 (LAt_4-LAt_5) 

(.06)  (.06)  (.06) 

-.075 (LOt_1-LOt_2) +  .  023 (LOt_4-LOt_5) + . 087 
(.04)  (.04)  (.05 

LOt  =  .030 (LAt_1-LAt_2) -.005 (LAt_4-LAt_5) 

(.08)  (.06) 

+1 .881LOt_1-.881LOt_2-.006 (LOt_4-LOt_5 ) "• 
(.05)  (.05)  (.08)  ( 

LAt  =  1.742LAt_1-.742LAt_2+.013 (LAt_4-LAt_5) 

(.10)  (.10)  (.10) 

-.067 (LOt_1-LOt_2) "•  00  3 ( LOt_4-LOt_5 ) -  -  0 8 6 
(.08)  (.08)  (.09 

LOt  =  -.145 (LAt_1-LAt_2) +.125 (LAt_4-LAt-5> 

(.12)  (.13) 

+  1 . 831LOt_1-.831LOt_2-.04  2  (L04._/,-L04._c)  -. 
.10)  (.10)  (.12 


APPENDIX  B 
PSI  WEIGHT  AND  COVARIANCE  MATRICES 
The  psi-weight  matrices  are  listed  from  through  ^ ^ , 


10-15N 

45-83W 


1.00000 

0.00000 

1 .69589 

-.00041 

2.18006 

-.00094 

0.00000 

1.00000 

.23301 

1.60668 

.53652 

1.97465 

2.51686 

-.00147 

2.85208 

-.00222 

3.15546 

-.00313 

.83347 

2.19776 

1.01723 

2 .58356 

1.21311 

2.96590 

3.41529 

-.00408 

3.62993 

-.00500 

3.81298 

-.00593 

1.44244 

3.29566 

1.69130 

3.54926 

1.91324 

3.79963 

3.97084 

-.00689 

4.10678 

-.0078t, 

4.22289 

-.00879 

2.11691 

4.04808 

2.31526 

4.28038 

2.5136 

4.48470 

The  psi-weights  are  followed  by  the  covariance  matrix  of 
the  one  step  ahead  forecast  error  computed  using  all 
storms  in  the  region. 

75.6900  2.4420 

2.4420  201.3561 

15-20N 

45-87W 


1.00000 

0.00000 

1.76569 

-.00960 

2.35186 

-.02439 

0.00000 

1.00000 

.01206 

1.77455 

.03064 

2.37436 

2.30050 

-.04147 

3.24963 

-.04534 

3.67499 

-.04304 

. 05209 

2.83877 

.02382 

3.28640 

-.03011 

3.70181 

4.06349 

-.03864 

4.40934 

-.03434 

4.72193 

-.02853 

-.09460 

4.07719 

-.16054 

4.40979 

-.23253 

4.70710 

5.00624 

-.02102 

5.26492 

-.01224 

5.49947 

-.00282 

-.31066 

4.97395 

-.39298 

5.21358 

-.47683 

5.42840 

125.2161  -2.9775 

-2.9775  257.6025 
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20-25N 

45-100W 


1.00000 

0.00000 

1.84867 

-.06432 

2.56723 

-.16898 

0.00000 

1.00000 

.02610 

1.77858 

.06857 

2.38309 

3.17432 

-.29669 

3.69960 

-.36173 

4.15842 

-.39159 

.12039 

2.85101 

.12420 

3.33268 

.09955 

3.80332 

4.56213 

-.40421 

4.91924 

-.41083 

5.23258 

-.40822 

.05982 

4.24724 

.01387 

4.65601 

-.03964 

5.04549 

5.50628 

-.39626 

5.74493 

-.37624 

5.95312 

-.35090 

-.10014 

5.38938 

-.16604 

5.71947 

-.23538 

6.02665 

143.2809  6.1945 

6.1945  179.5600 

I  25-30N 
45-100W 


1.00000 

0.00000 

1.77698 

-.10098 

2.37029 

0.00000 

1.00000 

.10287 

1.83734 

.26894 

2.81451 

-.46040 

3.15533 

-.58649 

3.43948 

.46902 

3.08971 

.48432 

3.57154 

.38166 

3.69387 

-.71528 

3.93228 

-.75231 

4.14408 

.21277 

4.42070 

.01594 

4.81731 

-.19133 

4.32554 

-.80898 

4.47784 

-.82946 

4.60518 

-.40259 

5.52073 

-.61655 

5.82149 

-.83347 

212.5764 

-28.9362 

-28.9362 

307.3009 

30- 

35N 

45- 

80W 

1.00000 

0.00000 

1.74583 

-.04904 

2.29962 

0.00000 

1.00000 

.05037 

1.84062 

.13028 

2.70874 

-.21940 

2.99394 

-.  32865 

3.19124 

.22535 

3.13281 

.29555 

3.69010 

.34972 

3.32638 

-.56639 

3.41767 

-.68491 

3.47875 

.39382 

4.69368 

.43171 

5.13563 

.46427 

-.26399 

2.52809 

-.66503 

4.00860 

-.78303 

5.18586 

-.84333 

6.09115 


3.51908  -.90931  3.54521  -1.01312  3.56167 

.49240  5.91698  .51696  6.26278  .53871 


-.12684 

2.54479 

-.44627 

4.21140 

-.79966 

5.54218 

-1.11073 

6.58173 


270.6025 

-63.7864 

-63.7864 

390.0625 

35- 

4  ON 

45- 

76W 

1.00000 

0.00000 

1.73508 

-.07490 

2.27314 

-.19593 

0.00000 

1.00000 

.03048 

1.88079 

.07973 

2.65430 

2.66497 

-.34283 

2.85457 

-.47828 

2.92116 

-.59419 

.13951 

3.33191 

.19889 

3.91796 

.25294 

4.42487 

2.91666 

-.68797 

2.87445 

-.76019 

2.82412 

-.81537 

.29945 

4.86357 

.33787 

5.24361 

.36905 

5.57316 

2.77979 

-.85792 

2.74682 

-.89159 

2.72590 

-.91930 

.39429 

5.85916 

.41490 

6.10749 

.43203 

6.32317 

336.3556 

-11.8191 

-11.8191 

428.9041 

40 

-45N 

45 

-70W 

1.00000 

0.00000 

1.74183 

-.06665 

2.30179 

-.17149 

0.00000 

1.00000 

-.14469 

1.83112 

-.37288 

2.53152 

2.73235 

-.29594 

3.08236 

-.43071 

3.36253 

-.56523 

-.64245 

3.12881 

-.80439 

3.60144 

-.89091 

3.97068 

3.58379 

-.69279 

3.75626 

-.80950 

3.88867 

-.91328 

-.92390 

4.25466 

-.91826 

4.46862 

-.88804 

4.62674 

3.98864 

-1.00347 

4.06270 

-1.08035 

4.11637 

-1.14476 

-.84346 

4.74095 

-.79187 

4.82111 

-.73839 

4.87533 

367.8724 

-68.7230 

-68.7230 

538.7041 

APPENDIX  C 

BIVARIATE  THRESHOLD  CONFIDENCE  ELLIPSE  EXAMPLE 
(Hurricane  Frederic,  1979) 

This  example  illustrates  use  of  the  forecast 

model  parameters  (Appendix  A)  and  the  psi  weights  and 

covariance  matrices  (Appendix  B)  to  compute  a  confidence 

ellipse  for  a  forecast  that  crosses  a  threshold  boundary. 

The  positions  of  interest  for  hurricane  Frederic  are 

t  (12.0,45.1)  (12.5,47.0)  (12.9,48.7)  (13.3,50.4) 

(13.8,52.3)  (13.3,54.1)  (14.9,55.5)  (15.5,57.2) 

(16.3.58.8)  (16.7,59.8)  (17.1,60.8)  (17.5,61.8) 

(17.8.62.8)  (18.0,63.8)  (18.1,64.8)  (18.1,65.8) 

(18.1.66.8)  ]  . 

From  time  origin  5  (13.8,52.3),  the  models  in  Appendix  A 

yield  forecasts  for  lead  times  of  6  through  48  hours: 

[  (14.2,54.1)  (14.6,55.9)  (15.1,57.6)  (15.4,59.1) 

(15.8,60.6)  (16.2,62.1)  (16.6,63.5)  (17.1,64.8)  }. 

The  90%  confidence  ellipse  for  the  48  hour  forecast  can  be 

readily  constructed.  From  the  model  for  10-15N, 

LAt  =  1.696LAt_1-.696LAt_2+.101 (LAt_4-LAt_5) +.048 

LOt  =  .233(LAt_1-LAt_2)~.075  (LAt_4-LAt_5) 

+1.607LOt_1-.607LOt_2+-251 (LOt_4-LOt_5 ) +.127 

and  for  15-20N, 

LA^  =  1 . 7 6 6LA^- .  7 6 6LA-(- _ 2- .  1 0 6  ( LA^- _ 4_LA^-_ 5 ) 

-.010  (LOj-_i  -LOt_?)  +  .014  (LO*._4-LOt_5)  +  -053 
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LOt  =  .012  (LAt_i -LAt_2) -.050  (LAt_4-LAt-5) 

♦1.775LOt_1-.775LOt_2+.088 ( LOt_4-LOt_5 ) + . 1 39 
These  models  can  be  specified  in  terms  of  the  noise  series 
alt'  a2t'  nlt'  an<^  n2t  w^ich  represent  the  latitude  and 
longitude  shocks  in  the  region  10-15N,  and  the  latitude 
and  longitude  shocks  in  the  region  15-20N  respectively. 
First  represent  the  models  in  terms  of  the  $  (B) 
polynomials.  The  polynomial  for  10-15N  is 

LAt  =  (1.696B-.696B2+.101B4-.101B5)LAt  +  0 . OLOt  +  alt  (Cl) 
LOt  =  (.233B-.233B2-.075B4+.075B5)LAt 

+ (1.607B-.607B2+.251B4-.251B5)LOt  +  a2t  (C2) 

and  for  15-20N, 

LAt  =  (1.766B-.766B2+.106B4-.106B5)LAt 

+ (-.010B+.010B2+.014B4-.014B5)LOt  +  nlt  (C3) 

LOt  =  (.012B-.012B2-.050B4+.050B5)LAt 

+ (1.775B-.775B2+.088B4-.088B5)LOt  +  n2t  {C4) 

Next  solve  (C3)  and  (C 4 )  for  LAt  and  LOt  in  terms  of  the 
noise  to  yield,  for  15-20N 

LAt  =  a/b 

where  a  =  (  (1-1.775B  +  .77  5B2- . 0  8  8 B4  + . 0 8  8B5 )  a  1 1 

+ (-.010B+.010B2+.014B4-.014B5)a2t  } 
and  b  =  C ( 1-1 . 766B+ . 766B2- . 106B4+ . 106B5 ) 

(1-1.775B+.775B2-.088B4+.088B5) 

- (-. 010B+. 010B2+ .014B4-.014B5) 

(.012B-.01B2-.050B4+.050B5) )  .  (C5) 
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LOt  =  c/b 

where  c  =  ( . 012B- . 0 12B2- . 050B4+ . 050B5 ) a1 t 

+(1-1.766B+.766b2— .106B4+.106B^)a2£  .  (C6) 

Similarly,  for  10-15N, 

LAfc  =  d/e 

where  d  =  [  ( 1-1 . 607B+ . 607B2- . 251B4+ . 251B5 ) alt  +  0.0a2t  3 
and  e  =  { (1-1 . 696B+ . 696B2-. 101B4+ . 101B5) 

(1-1.607B+.607B2-.251B4+.251B5) 

-  (0)  (  .233B-.  233B2-.075B4+.075B5)  }  .  (C7) 

LOt  =  f/e 

where  f  =  ( . 233B- . 233B2- . 075B4+ . 075B5 ) a1 t 

+ (l-1.696B+.696B2-.101B4+.101B5)a2t  .  (C8) 

The  ratios  in  (C 7 )  and  (C8)  are  the  psi  weight  matrices 
for  the  region  10-15N.  These  2  by  2  matrices  are  denoted 
,  where  i  denotes  the  lag  from  the  forecast  position. 
Similarly,  the  ratios  in  (C5)  and  (C6)  yield  the  psi 
weights  for  the  region  15-20N.  These  are  also  2  by  2 
matrices  are  denoted  <|)^.  Then,  following  the  development 
in  the  section  on  Threshold  Interval  Forecast  of  Two 
Variables , 


1  1.000 

0.000 

*1  = 

1.696 
_  .233 

0.000 

’i'2- 

[  2.180 

0.000 

1 _ 0 .000 

1.000_ 

1 . 607 _ 

L  -537 

i  .  975_ 

1  1.000 

o.ooo' 

?1  = 

~ 1.766 

-.01 0  1 

?2  = 

2 .352 

-.02  4 

l_o.ouo 

1 .000_ 

_  .012 

1 . 77  5 _ 

_  .031 

2.37  4 _ 

1  2 .801 
|_  .052 

- .  04  2 

C  4  = 

_ 3 .250 

- .  04  5 

6  5  = 

3 .675 

- .  04  3 

2 . 8  39 _ 

.024 

3 . 2  86 _ 

_ - .030 

3.70  2 _ 

The  covariance  matrix  of  the  one  step  ahead  forecast 
errors  for  10-15N  is 


;a  =  |  75.69  -2.4  4  ] 

L_-2-44  201 . 3  6 _ | 


For  15-20N, 


125.22  -2.98 

|_  -2.98  257 . 60 _ 


(4.40)  , 

and  (4.41) , 

C11  c  2 1 

=  ?  5^o  = 

3.675 

-.043 

dll  d21 

-.030 

3.702 

C1 2  c2 2 

=  = 

6.222 

-.071" 

d12  d22 

.812 

5.948 

C1 3  c23 

=  ?5*2  = 

7.989  - . 0 8 8  ^ 

I  1 

d13  d23 

1.921 

7.310 

(C9 ) 


(CIO) 


(Cll) 


Summing  the  squares  of  the  elements  of  the  matrices  in  (C 9 ) 
(CIO),  and  (Cll)  yields, 

""116.043  .0147' 

4.350  102.520 


rcll2+c122+c132  c212+c222+c232-1  ^ 


dll  +d12  +d13' 


+  d00  +d 


‘21  22 


23 


2  2 

These  are  the  multipliers  for  aa^  ,  and  oa2  in  equation 
(4.39)  and  (4.40).  The  multipliers  for  the  cov(aia2)  ^n 
(4.39)  are  given  by 


C1  lc21  +  c12c22  +  c13c23  ~  1,306 


and  for  cov(a^a 2^  ln  (4*40)/ 


dlld21+d12d22+d13d23 


18.761 


For  £n,  the  multipliers  are  the  £  weights.  Specifically, 
for  (4.39) ,  the  sum 

c142+  •••  +c182  =  (3.250) 2+  (2.801)  2+ (2.353)  2  +  (1.766) 2+  (1) : 

=  28.082 

is  the  coefficient  of  anl2.  The  sum 

c242+  •••  +c282  =  (-.045) 2+ (-.042) 2+ (-.025) 2+ (-.012) 2+ (0) 

=  .00456 

is  the  coefficient  of  a  r]22  in  (4.39),  and, 

c14c24+  ***  +C1 8C2  8  =  -*340 
is  the  coefficient  of  cov^^r^)  in  (4.39). 

For  (4.40),  anl2  is  multiplied  by 
d142+  +dlg2  =  ( .024) 2+ ( .052)  2+ ( .031)  2+ (  .012) 2+ (0) 2 

=  .00439 

and  a  -,2  is  multiplied  by 
n  ^ 

d242+  ***  +d282  -  28.644 
Then  cov(n4n2)  is  multiplied  by 

d14d24+  +dl 8d2  8  =  *321 

The  coefficients  of  aal 2,  cov(aia2),  °a22'  %l2, 

cov(n1n2),  and  CTn22  in  equation  (4.41)  are  20.289, 
108.787,  -1.228,  .318,  28.348,  and  -.344  .  So,  from 
(4.39)  the  variance  of  the  8  step  ahead  latitude  forecast 
from  time  origin  5  with  2  forecasts  falling  in  the  old 
region  is 


V  [e15  (  8 , 2) ]  =  (116.043)aal2+(.0147)oa22  +  2(-1.306)cov(a1a2) 
+  (28.082) anl2+ ( .00456) an22+2 (-.340)cov(nin2) 

=  (116. 21)  (75. 69) +  (.0147)  (201. 36) +  2  (-1.306)  (-2. 44) 

+  (28.082)  (125.22) +  (.00456)  (257.60) +2  (-.340)  (-2.98) 

=  12295.427  . 

Thus,  the  standard  deviation  of  the  8  step  ahead  forecast 
of  latitude  is 

SD[e15(8,2)]  =  (12295.427) 1/2  =  110.9  n  mi  . 

This  compares  well  with  the  empirical  standard  deviation 
of  102.8  n  mi. 

From  (4.40)  the  variance  of  the  eight  step  ahead 
forecast  of  longitude  is 

V[e25  (8, 2)  1=4.350(75.69)  +102. 520  (201. 36)  +  2  (18. 761)  (-2. 4  4) 

+  .004 39  (125. 22) +28. 644  (25 7. 60) +2  (.321)  (-2. 98) 

=  28439.717 

and 

[©25(8,2)]  =  168.6  n  mi 

which  compares  favorably  with  the  empirical  value  of  149.7 
n  mi .  From  (4.41)  , 

E[e15 (8,2)e25  (8,2) ]  =  20 . 2 89 ( 75 . 69 ) + 108 . 787 ( -2 . 44 ) 

+(-1.228) (201.36) +.318 (125.22) 

+  28.348  (-2.98)  +  (-3.44)  (257.60) 

=  1421.0 

Conseguently,  the  covariance  matrix  for  the  eight  step 


ahead  forecast  is 
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*8 


12324.9  1421.0 

1421.0  28258.5 


The  confidence  ellipse  can  be  specified  by  using  the 
procedure  described  in  Interval  Forecast  of  Two  Variables. 
The  eigenvalues  for  Zg  are  A^=  12171.30,  and  X2 
=  28563.84,  and  the  eigenvectors  are 


<.087,.996>  <.996,-.087>  . 

The  major  axis  has  length 

2  (  X  2 )  ' 5  (1.645)  =  556.0  n  mi. 

The  minor  axis  has  length 

2U1)  *  5  ( 1 . 64  5 )  =  377.5  n  mi. 

The  minor  axis  lies  on  a  magnetic  heading  of 

360  +  tan-1  (  . 087/ . 996 )  =  4.99  degrees 
and  is  centered  at  the  point  forecast  (17.1,  64.8)  . 

Finally,  the  equation  of  the  90%  confidence  ellipse  is 
given  by 

(Xjl/168.6)  2~2  (  .076)  (x1/168.6)  (x2/ 110 . 9 )  +  (x2/ HO  .  9  )  2 
=  1.645  [1- (  .076) 2] 

where  x^  represents  the  east-west  direction  and  x2 
represents  the  north-south  direction  and 
p  =  .076  =  [ (1421.0) (12295. 4 ) " • 5 ( 2 8439 . 7 ) " • 5 ]  . 
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ANALYSIS  OF  RESIDUALS 

Chi-square  contingency  table  tests  (99% 
confidence)  were  used  to  evaluate  the  independence  of  the 
latitude  and  longitude  one  step  ahead  forecasts.  The 
tests  rejected  the  null  hypothesis  of  independence  which 
was  acceptable  because  a  dependency  structure  was  allowed 
However,  chi-square  goodness  of  fit  tests  also  rejected 
the  null  hypothesis  that  the  forecasts  were  Normally 
distributed  which  was  not  acceptable. 

The  region  posessing  the  worst  chi-square 
value  (20-25N)  was  examined  in  detail.  Further  goodness 
of  fit  tests  did  not  reject  the  double  exponential 
distribution  for  both  marginals.  This  indicated  the 
category  five  forecast  distributions  had  heavier  tails 
than  expected.  Outliers  were  examined  case  by  case.  It 
was  discovered  that  a  few  storms  that  were  classified  as 
category  five  had  short  non-category  five  track  segments. 
That  is,  the  storms  accelerated  during  the  time  they  were 
supposed  to  be  moving  at  constant  velocity.  The  segments 
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containing  the  accelerations  were  removed  and  the  category 
five  model  was  then  fit  to  the  new  data.  New  coefficients 
were  not  significantly  different  from  old  coefficients 
(Table  Dl).  Mean  forecast  accuracy  was  not  significantly 

TABLE  Dl 

OLD  VS  NEW  COEFFICIENT  MATRICES 


Old 

coeff 

icients 

New  Coeff 

icient 

LA 

LO 

LA 

LO 

Lag  1 

LA  | 

|  -  8  4 

-  .06  I 

|  .81  ■ 

-  -  08  I 

LO 

_ .03 

•  ?  8 _ 1 

| _ .05 

.  7  6 _ 1 

LA 

LO 

LA 

LO 

Lag  4 

LA 

.01 

.07H 

r  .04 

-  08  1 

LO 

-.05 

.  12 _ | 

1 _ -  .  09 

•12J 

different.  Deleting  the  track  segments  removed 
approximately  4%  of  the  forecasts  from  each  tail  of  the 
forecast  error  distributions.  All  the  deleted  forecasts 
were  beyond  three  standard  deviations  from  the  mean. 

A  second  set  of  chi-square  tests  did  not  reject 
the  null  hypothesis  of  Normally  distributed  errors.  In 
addition,  the  contingency  table  tests  did  not  reject  the 
null  hypothesis  of  independence. 

Outliers  beyond  three  standard  deviations  (at 
most  4%  from  each  tail)  were  deleted  from  the  one  step 
ahead  forecast  errors  in  other  regions.  Chi-square  tests 
did  not  reject  independence  or  Normality. 
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Sequential  regression  analysis  of  lagged 
residuals  showed  no  significant  auto  or  cross-correlations 
at  lags  one  through  four. 

To  afford  additional  protection  when  forecasting 
accelerating  storms,  the  empirical  one  step  ahead  forecast 
error  covariance  matrices  based  on  all  storms  (Appendix  B) 
are  used  in  confidence  ellipse  computations. 
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TRANSITION  MATRICES 

A  representative  analysis  of  the  four  step  ahead 
transition  matrix  for  the  region  25-30N  is  presented  in 
this  appendix.  The  heuristic  procedure  used  to  develop 
the  matrix  involved  moving  a  window  of  seven  previous  six 
hour  position  reports  along  each  hurricane  track.  Seg¬ 
ments  of  longer  length  resulted  in  large  estimated  parame 
ter  variances,  while  shorter  segments  missed  short  term 
accelerations.  At  each  position  report  a  stationarity 
category  was  determined.  For  example,  at  the  seventh 
position  report,  the  stationarity  category  was  based  on 
positions  one  through  seven.  At  the  eighth  position  re¬ 
port  the  stationarity  category  was  based  on  positions  two 
through  eight,  etc..  In  this  manner,  a  stationarity  cate' 
gory  was  associated  with  the  terminal  point  in  each 
"window."  The  transition  matrix  was  then  constructed 
based  on  those  categories. 

Some  difficulty  was  encountered  in  determining 
the  storm  categories  primarily  due  to  the  small  window. 
With  seven  observations,  the  standard  deviations  of  the 
parameter  estimates  were  approximately  .1  .  If  the  storm 


was  in  category  nine,  the  series  had  to  be  differenced 
twice  for  stationarity.  This  resulted  in  five 


observations  being  available  to  compute  the  lag  one 
parameter . 

To  sort  the  storms  into  the  various  categories, 
the  lag  one  parameter  was  calculated  for  each  latitude  and 
longitude  series.  The  raw  position  series  was  analyzed 
first.  By  trial  and  error  it  was  determined  that 
coefficient  values  below  .8  were  good  indicators  of  mean 
stationarity.  If  the  lag  one  parameter  was  greater  than 
.8  (two  standard  deviations  below  1.0)  ,  the  series  was 
differenced  and  the  lag  one  parameter  was  calculated  for 
the  velocity  model.  If  the  parameter  was  still  greater 
than  .8,  the  series  was  differenced  again  and  the  parame¬ 
ter  was  calculated  for  the  acceleration  model.  Usually, 
at  some  point  in  the  procedure,  the  series  were  found  to 
be  stationary.  In  a  very  few  cases  (less  than  1%)  the 
series  were  always  nonstationary.  Rather  than  difference 
the  series  a  third  time,  these  cases  were  discarded. 

Given  that  the  storm  is  in  a  particular  category 
at  time  t,  the  category  at  time  t  +  1,  t+2,  ...  ,  can  be 
determined.  By  recording  the  results  for  all  storm 
tracks  it  is  possible,  given  the  current  category,  to 
determine  the  probability  the  storm  will  transition  to 
some  other  category  at  a  future  time.  The  result  of  the 
analysis  is  a  transition  matrix  (Table  E.l).  For  example, 
given  that  the  storm  is  presently  in  category  2,  there  is 
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REQUIRED  DATA  MANIPULATION 
The  data  files  and  the  five  major  steps  required 
to  calculate  the  model  coefficients  for  each  latitude  band 
are  described  in  the  following  pages.  The  five  steps  are: 
data  reduction,  determination  of  the  stationarity 
category,  construction  of  the  data  matrix,  coefficient 
computation,  and  analysis  of  empirical  forecast  error. 

"Best  track"  storm  data  were  provided  via  magnet¬ 
ic  tape  by  the  National  Environmental  Satellite  Data  and 
Information  Service  (an  agency  of  the  National  Oceanic  and 
Atmospheric  Administration),  Asheville,  North  Carolina. 

The  data  set  contained  position  reports  at  6  hour 
intervals  for  815  North  Atlantic  storms  (including  hurri¬ 
canes,  tropical  storms,  and  subtropical  storms)  dating 
from  1886  through  1983.  The  tape  format  had  80  characters 
per  record.  There  were  three  types  of  card  images:  the 
Title  Card  (Table  F.l) ,  the  Storm  Classification  Card 
(Table  F.2)  ,  and  the  Storm  Data  Cards  (Table  F.3) 

(Jarvinen,  Neumann,  and  Davis,  1984). 

In  each  region,  the  first  step  was  to  condense 
the  data  file  by  eliminating  data  that  was  not  required. 
Based  on  recommendations  from  the  analysts  at  the  National 
Hurricane  Center,  all  storms  occurring  before  1945  were 
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TABLE  P.l 


TITLE  CARD  -  FORMAT  AND  CONTENTS 


Computer  Card  Columns 


Contents 


1-5 
7-8 
10  -  11 
13  -  16 
20  -  21 

23  -  24 
25  -  30 
31  -  34 
36  -  47 
48  -  52 
53 

54  -  58 
59 

60  -  79 
80 


Card  sequence  number 
Month 

Day  (first  day  of  storm) 

Year 

Value  of  M  (M=number  of  days  the 
storm  existed) 

Storm  number  for  the  year 
Blank 

Cumulative  Storm  Number 

Storm  Name 

Blank 

(l=hit  coastline,  0=did  not) 
Blank 

Saf fir/Simpson  Hurricane  Scale 

number 

Blank 

Last  storm  of  the  year  if  =  l 


TABLE  F . 2 

CLASSIFICATION  CARD  -  FORMAT  AND  CONTENTS 
Computer  Card  Columns  Contents 


1 

7 


5 

8 


Card  sequence  number 
Maximum  status  of  the  storm 
during  its  life 


TABLE  F . 3 

STORM  DATA  CARD  -  FORMAT  AND  CONTENTS 

Latitude  and  longitude  are  rounded  to  the  nearest 
tenth.  Wind  speed  is  rounded  to  the  nearest  five  knots. 
Pressure  is  rounded  to  the  nearest  millibar.  Storm  types 
are:  '*'-tropical  storm  or  hurricane,  'D'-  tropical 
disturbance,  'S'-subtropical  storm,  'W'-tropical  wave,  and 
' E ' -extratropical  storm. 


Computer  Card  Columns 


Contents 


1 

-  5 

7 

-  8 

0 

-  11 

12 

3 

-  15 

6 

-  19 

20 

1 

-  23 

24 

Card  sequence  number 

Month 

Day 

Storm  type  at  0000Z 
Latitude  at  0000Z 
Longitude  at  0000Z 
Blank 

Wind  speed  at  0000Z 
Blank 

Central  pressure  at  0000Z 
Storm  type  at  0600Z 
Latitude  at  0600Z 
Longitude  at  0600Z 
Blank 

Wind  speed  at  0600Z 
Blank 

Central  pressure  at  0600Z 
Storm  type  at  1200Z 
Latitude  at  1200Z 
Longitude  at  1200Z 
Blank 

Wind  speed  at  1200Z 
Blank 

Central  pressure  at  1200Z 
Storm  type  at  1800Z 
Latitude  at  1800Z 
Longitude  at  1800Z 
Blank 

Wind  speed  at  1800Z 
Blank 

Central  pressure  at  1800Z 


deleted  due  to  concerns  about  the  accuracy  of  the 
observations.  All  storms  that  attained  less  than  tropical 
storm  status  (12  of  them)  were  deleted  because  of  their 
weak  persistence.  Next,  all  storm  tracks  outside  the 
region  of  interest  were  deleted.  Finally,  because  the 
focus  of  the  research  was  to  accurately  predict  hurricane 
landfall  based  on  the  past  track,  position  reports 
following  landfall  on  the  continental  United  States  were 
eliminated  as  were  the  central  pressures  and  wind 
velocities.  This  left  362  storm  tracks  containing  a  total 
of  over  10,000  position  reports.  The  final  condensed  data 
file  contained  the  cumulative  storm  number,  the  year, 
month,  and  day  o*  the  first  position  report,  the  storm 
name,  the  number  of  position  reports,  and  the  latitude  and 
longitude  coordinates  of  each  position  report  (Table  F.4). 

In  the  second  step,  the  following  procedure  was 
used  to  determine  the  stationarity  category  of  the  storm. 
For  latitude, 

(1)  Adjust  the  latitude  series  to  zero  mean. 

(2)  Lag  the  data  one  period. 

(3)  Use  least  squares  regression  to  calculate  the  lag 
one  position  forecast  coefficient. 

(4)  If  the  coefficient  is  less  than  .8  (the  storm  is 
latitude  position-stationary)  go  to  (12). 

(5)  Difference  the  series  the  first  time. 

(6)  Use  least  squares  regression  to  calculate  the  lag 
one  velocity  forecast  coefficient. 


(7)  If  the  coefficient  is  less  than  .8  (the  storm  is 

latitude  velocity-stationary)  go  to  (12)  . 

(8)  Difference  the  series  a  second  time. 

(9)  Use  least  squares  regression  to  calculate  the  lag 

one  acceleration  forecast  coefficient. 

(10)  If  the  coefficient  is  less  than  .8  (the  storm  is 

latitude  acceleration-stationary)  go  to  (12). 

(11)  Discard  the  storm.  Go  to  (14). 

(12)  Perform  steps  ( 1 )  —  (11)  for  longitude. 

(13)  Classify  the  storm  according  to  Table  4.1. 

(14)  Read  the  next  storm  track. 

Various  stationarity  "cutoffs"  were  examined  from 
<(>  =  .6  through  <(>=1.2.  The  value  of  ((>=.8  was  selected 
because  it  resulted  in  the  minimum  forecast  errors  and 
because,  for  individual  storms,  the  standard  deviation  of 
the  lag  one  coefficient  was  approximately  .1.  Thus, 
coefficient  values  below  .8  were  good  indicators  of  mean 
stationarity.  Once  the  stationarity  category  was  deter¬ 
mined,  it  was  appended  to  the  storm  name.  A  two  storm 
sample  of  the  resulting  data  file  is  shown  in  Table  F.4. 

The  data  matrices  were  then  constructed.  There 


are  nine  data  matrices  for  each  grid,  one  for  each  of  the 


in  Table  4.1. 
data  matrices 


For  example,  the  category 
for  the  storms  in  Table  F.4 


are  shown  in  Tables  F.5  and  F.6. 


TABLE  F . 4 


CONDENSED  DATA  FILE  AND  SAMPLE  RECORDS 
Title  Card  Columns  Contents 


1  - 

2 

Blank 

3  - 

6 

Cumulative  sequence  number 

7 

Blank 

8  - 

11 

Year 

12 

Blank 

13  - 

14 

Month 

15 

Blank 

16  - 

17 

Day  of  first  position  report 

18 

Blank 

19  - 

33 

Name 

34 

Blank 

35  - 

36 

Number  of  position  reports 

37 

Blank 

38  - 

39 

Maximum  storm  type  attained 

40  - 

41 

Blank 

42 

Stationarity  category 

tion 

Card  Columns 

1 

Blank 

2  - 

5 

Latitude 

6 

Blank 

7  - 

10 

Longitude 

Sample  Records  For  Grid  25-30N 

690  1970  07  31  CELIA  08  TS  5 

25.3  89.6 

25.8  90.8 

26.2  92.0 
26.6  93.5 
27.0  94.9 

27.5  96.3 

28.1  97.8 

28.6  99.3 

694  1970  09  12  FELICE  08  HR  5 

25.3  84.0 

25.8  85.2 
26.5  86.5 

27.2  88.4 
28.0  90.2 

28.8  92.2 

29.4  94.1 

29.9  95.5 


«.   V.  •  «  1  ~  » 


V.  ■ 


TABLE  F .  5 


CATEGORY  1  REGRESSION  DATA  MATRIX 
(For  records  in  Table  F.4) 

This  is  position  data  for  Celia  and  Felice  adjusted  to 
zero  mean.  This  example  is  mean  nonstationary.  Missing 
data  are  denoted  by  an  'X'.  Actual  data  matrices  were 
lagged  up  to  six  periods.  This  example  is  lagged  four 
periods  for  clarity  of  presentation.  The  dependent 
variables  are  at  lagO.  The  predictor  variables  are  at 
lagl  through  lag4.  This  is  only  an  example  of  the 
category  1  format.  Celia  and  Felice  are  category  5 
storms . 

Latitude  Longitude 

lagO  lagl  lag2  lag3  lag4  lagO  lagl  lag2  lag3  lag4 


TABLE  F . 6 


CATEGORY  5  REGRESSION  DATA  MATRIX 
(For  records  in  Table  F.4) 

This  is  velocity  data  for  Celia  and  Felice.  This  example 
is  mean  stationary.  Missing  data  are  denoted  by  an  'X'. 
Actual  data  matrices  were  lagged  up  to  five  periods.  Thi 
example  is  lagged  four  periods  for  clarity  of 
presentation.  The  dependent  variables  are  at  lagO.  The 
predictor  variables  are  at  lagl  through  lag4.  These  are 
the  actual  data  matrices  used  for  Celia  and  Felice. 


Latitude 

lagO  lagl  lag2  lag3  lag4 


Longitude 

lagO  lagl  lag2  lag3  lag4 


In  this  research  only  the  forecast  models  for  the  category 
five  storms  were  developed.  A  portion  of  the  category 
five  data  matrix  for  25-30N  latitude  is  shown  in  Table 
F.6.  The  data  in  Table  F.6  represent  velocities  in  units 
of  degrees  of  displacement  per  six  hour  interval. 

In  the  fourth  step,  the  lag  zero  latitude  and 
longitude  columns  were  treated  as  dependent  variables  and 
a  least  squeVes  regression  was  performed  using  SPSS.  The 
resulting  coefficients  are  in  Appendix  A. 

Finally,  in  step  five,  the  models  were  used  to 
forecast  the  storms,  and  the  forecast  error  was  analyzed. 
Initially  there  was  concern  over  the  fact  that  the 
hurricanes  being  forecast  were  the  same  ones  used  to 
develop  the  model  coefficients.  To  analyze  the  effect  of 
this  "unfair"  advantage,  in  a  sample  region,  the 
storms  were  deleted  (one  at  a  time)  from  the  data  base, 
and  model  coefficients  were  computed  and  used  to  forecast 
only  the  excluded  storm.  There  was  no  significant  change 
in  forecast  accuracy.  For  example,  for  the  48  hour 
forecast,  the  error  changed  by  2  n  mi. 

It  was  concluded  that  the  large  number  of  obser¬ 
vations  in  each  region  tended  to  diminish  the  contribution 
of  individual  storms.  This  allowed  computation  of  model 
coefficients  in  approximately  90  seconds  (per  region)  on  a 
CDC  Dual  Cyber  170/750.  Had  it  been  necessary  to  delete 


individual  storms  prior  to  computing  forecast  coefficients 
it  is  estimated  that  run  times  would  have  increased  by  a 
factor  of  1000.  A  separate  run  would  have  been  required 
for  each  storm  in  each  region.  Most  storms  had  track 
segments  in  more  than  one  region. 

Once  the  model  coefficients  are  computed, 
forecasting  can  be  accomplished  rapidly.  Complete 
forecasts  and  error  analysis  for  up  to  72  hours  in  the 
future  requires  approximately  17  seconds  of  CDC  processor 
time  per  region.  This  includes  the  analysis  of 
approximately  7700  forecasts  per  region.  For  individual 
storms,  forecasts  with  lead  times  as  large  as  72  hours  in 
the  future  (without  error  analysis)  can  be  accomplished  on 
an  Apple  lie  micro-computer  in  approximately  four  seconds. 
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